December, 1944 


Journal of Applied Psychology 


EDITED BY: DONALD G. PATERSON, UNIVERSITY OF MINNESOTA 


Consulting Editors 


PAUL S. ACHILLES, Psychological Corporation; WALTER V. BINGHAM, A.G.O., War Department; 
HAROLD E. BURTT, Ohio State University; ARTHUR I. GATES, T. C. Columbia University; 
JOHN G. JENKINS, University of Maryland; IRVING LORGE, T. C. Columbia University; 
QUINN MCNEMAR, Stanford University; WILLARD C. OLSON, University of Michigan; 
JAMES P. PORTER, Ohio University; EDWARD K. STRONG, JR., Stanford University; 
MORRIS S. VITELES, University of Pennsylvania; JOSEPH ZUBIN, N. Y. Psychiatric Institute. 





Table of Contents 


Report on the “Classification Inventory.” A Personality Test for Industrial 
Use: C. E. JURGENSEN 

Locating the Troublemaker With the Guilford-Martin Personnel Inventory: 
H. G. MARTIN 

The Performance of Adult Female Applicants for Factory Work on the Likert- 
Quasha Revision of the Minnesota Paper Form Board Test: 
E. F. BALDWIN AND L. F. SMITH 

A Study of the Effect of the Presence of the Examiner upon Test Scores in 
Individual Testing: W. E. BINGHAM, JR. 

Motion and Time Study. A Resumé and Bibliography: J. E. ZERGA 

The Life Insurance Sales Research Bureau: S. HABBE 

A Study of Relationships to Somatotype: D. W. FISKE 

Interest in and Value of College Courses: A. Q. SARTAIN AND E. G. WARING 

A Discussion of Dorcus’ Study of the Humm-Wadsworth Temperament Scale: 
D. G. HUMM 

News and Notes 





Published Bi-monthly by The American Psychological Association, Inc. 
With the Cooperation of The American Association for Applied Psychology 
Prince and Lemon Sts., Lancaster, Pa., and Northwestern University, Evanston, Illinois 


Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the Act of March 3, 1879 
Copyright, 1944, by The American Psychological Association, Inc. 
















Journal of Applied Psychology 

Vol. 28, No. 6    December, 1944 





Report on the “Classification Inventory,” a Personality 
Test for Industrial Use 


Clifford E. Jurgensen 
Kimberly-Clark Corporation, Neenah, Wisconsin 


Industrial executives and supervisors are in general agreement that 
personality is one of the most important factors in industrial success, the 
term “personality” being used in a wide sense including temperament, 
interests, attitudes, habits, modes of reaction, disposition, sentiments, 
effect on other persons, etc. A definition which adequately states what 
industrialists essentially mean by “personality” is that of Allport: 
“Personality is the dynamic organization within the individual of those 
psychophysical systems that determine his unique adjustment to his 
environment” (1). 

Because of the emphasis which industrialists place on personality, 
many companies have used various personality tests in the employment 
of applicants. A few companies have accompanied such use with carefully 
controlled experimental work, but most have more or less uncritically 
accepted or rejected personality tests on subjective opinions or a few case 
history reports. In an article of this type it is impossible to review 
adequately the industrial use of personality tests; suffice it to say that 
such tests have generally been found to be of little value in the hiring of 
applicants when they have been submitted to objective checks. 

A major objection to the industrial use of existing personality tests is 
that items are such that even a person considerably below average in 
intelligence can predict the “right” answer, as for example in the 
frequently used question: “Do you daydream frequently?” Personality 
tests of this type have given rise to the facetious remark that only those 
applicants who are feebleminded or psychotic will fill in the blank 
honestly, and that tests are not needed to determine occupational suitability 
of such persons. Whether an individual will be honest in his replies to 
such items depends on what he believes he has at stake. If a person is 
paying a fee to a clinical psychologist for help and guidance, he will 
probably try to fill in the blank honestly. If applying for a job, 
however, he has strong motivation to give those answers which he believes 
will be considered by the prospective employer as being the “best” 
answers. Even in the counseling situation where the testee is trying to 
answer the items honestly, it is frequently found that rationalizations and 
defense mechanisms influence answers to a considerable extent. 

A second difficulty in the use of personality tests is that the testee 
frequently finds he cannot answer honestly even though he so desires. 
Most questionnaires require a dichotomous response of yes or no, or 
require a response of yes, ?, or no. Most persons do not consistently 
respond in a given way in all types of situations, or even consistently in 
one type of situation, and so in many cases the testee may well believe 
that neither a yes nor a no response is correct. A response of ? is difficult 
to interpret, for it may mean that the applicant does not know what his 
answer should be, does not understand the meaning intended in the item, 
or that neither a yes nor a no response is correct. A few personality tests 
have utilized degrees of frequency such as always, usually, sometimes, 
seldom, and never; or degrees of extent such as hilarious, cheerful, good 
humored, dispirited, and morbid.¹ The use of degree avoids some of the 
disadvantages of a yes-no or yes-?-no response, but introduces the 
additional ambiguity resulting from individual differences in tendency to 
think in terms of extremes. Connotation of degree varies for different 
persons. 

A third major disadvantage of personality tests in so far as their use 
in industry is concerned is that they have been developed to measure 
various “traits” or “components.” Thus there are many tests to measure 
dominance, extroversion, aggressiveness, paranoid tendencies, home 
adjustment, self-confidence, etc. The assumption is too often made that 
if a test is a valid measure of whatever trait it is intended to measure, 
it can safely be used in any situation wherein that trait is thought to be 
important. For example, if a test is considered a valid measure of 
extroversion, it is frequently believed (uncritically) that it will be valid in 
predicting success as a salesman. Even when experimental work is done 
to test the hypothesis that certain personality traits are related to 
occupational success, results must frequently be expressed in terms of a 
regression equation which is time consuming when used with a large 
number of applicants and which conceals the degree of presence or 
absence of various components. From the practical viewpoint of the 
employment situation, the trait being measured is unimportant. The 
major requirement is that the test be a reliable measure of something, 
whatever that something may be, which is closely related to occupational 
success. 


¹ As previously pointed out, an inventory or rating scale consistently combining 
degrees of frequency of behavior and degrees of type of behavior gives more 
meaningful results than either of these degrees used alone (3). 


A fourth disadvantage of personality tests is that they have usually 
been validated on students in high school or university or on extreme 
groups deviating significantly from those considered normal. These 
groups, however, make up only a very small percentage of industrial 
applicants. This, also, may partially account for failure of existing 
personality tests when used in the industrial situation. 

A personality test which is intended to meet successfully the needs of 
industry should fulfill the following major requirements: 

1. The applicant must not be able to predict the “right” answers 
which will be most favorable in his attempts to secure a job. 

2. Items must be worded in such a way that the applicant is able to 
give an answer. There should be no forcing of replies into a dichotomy 
of yes or no, there should be no ambiguous ?, and there should be no 
degrees which permit variable interpretations. 

3. The test should be scored and be validated on specific jobs rather 
than “traits.” Consequently, the test should have different scoring keys 
for different companies and for different jobs within a single company. 

4. The test should be validated on a population similar in all respects 
to the population for which it is subsequently to be used as a selection 
device. 

The Classification Inventory reported here is an attempt to meet the 
four requirements given above for a personality test which is useful for 
the purpose of predicting industrial job success. 


Development of Experimental Forms 


Consideration of the first two of the above requirements suggested 
use of the psychophysical methods of ranking or paired comparison. By 
a careful selection of items, such as by comparing favorable traits with 
other favorable traits, and unfavorable characteristics with other un- 
favorable characteristics, these techniques can be used to decrease the 
likelihood of “beating” the test by correctly predicting the “right” 
answers. 

The first step in the development of the Classification Inventory was 
to compile items suitable for a test of personality which meets the 
previously mentioned criteria. The extensiveness of the compiled list made 
it impractical to place the items in paired comparison form. For 
example, Form X-1 contained 599 items which would have given 179,101 
choices if placed in all possible combinations. The procedure followed 
was to place items in groups of five and require the testee to rank them 
from 1 to 5. Thus for each set of five items ranked by the testee, ten 
paired comparisons could be used for statistical computations. 

Items were placed in groups of five on the basis that (1) items should 
be such that testees can make a choice between them, and (2) items 
should be grouped in such a way that testees can see no ranking which 
logically appears to be “better” than any other ranking. 
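
The counts quoted above follow from the number of ways of choosing two items from a set. The short sketch below is illustrative only (the function name is hypothetical and not part of the original study); it shows that 179,101 pairs correspond to 599 items taken two at a time, and that a ranked group of five yields ten pairs.

```python
from math import comb


def paired_comparisons(n_items: int) -> int:
    """Number of distinct pairs that can be formed from n_items items."""
    return comb(n_items, 2)


# 599 items taken two at a time give the 179,101 choices quoted in the text.
all_pairs = paired_comparisons(599)

# A single group of five ranked items yields ten paired comparisons.
pairs_per_group = paired_comparisons(5)

print(all_pairs, pairs_per_group)
```

Grouping the items in fives therefore keeps the testee's task manageable while still producing ten comparisons per group for analysis.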

Four revisions were made on the basis of suggestions of personnel 
executives and employment men, and on experimental tryouts on small 
groups of individuals. The fourth form was administered to 96 persons 
employed by approximately 20 companies. Subjects ranged from hourly 
paid production workers to top executives, and were approximately 
equally divided between men and women with ages ranging from 20 to 52. 
Data were analyzed by retabulating in paired comparison form, ten com- 
parisons being made for each group of five items. 

In analysis of the data, degree of preference was taken into 
consideration as well as the items preferred. This was done by noting the 
number of items given an intermediate rank between the two items under 
consideration; in other words, by subtracting the lower rank from the 
higher. For example, when comparing “Be calm” with “Be cheerful,” 
the degree of preference was called 1 if there were no intermediate 
rankings, 2 if the items were separated by one other item, and so on to 4 if 
one was given a rank of one and the other a rank of five. Summed over 
the 96 subjects, the minimum possible degree of preference for a pair was 
therefore 96, and the maximum was 384. These figures do not take into 
consideration which of the two items was preferred, as the ideal 
arrangement of items is such that for any pair, both items are equally 
often selected as preferred, and both items are widely separated in rank 
order. Such an ideal would assure items which differentiate between 
individuals and yet (because of the distance between them) are reasonably 
reliable. For illustrative purposes results from one group of five items 
are given in Table 1. 
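
The tabulation just described can be sketched as follows. This is a minimal illustration, not the author's worksheet: the function names and the three sample rankings are hypothetical, and a rank of 1 is assumed to mean the most preferred item.

```python
def compare_pair(rankings, first, second):
    """Tally one paired comparison from a list of rankings.

    rankings: list of dicts mapping item -> rank within its group of five
              (assumption: rank 1 is the most preferred item, no ties).
    Returns (number preferring first, number preferring second,
             degree of preference summed over all subjects).
    """
    prefer_first = sum(1 for r in rankings if r[first] < r[second])
    prefer_second = len(rankings) - prefer_first
    degree = sum(abs(r[first] - r[second]) for r in rankings)
    return prefer_first, prefer_second, degree


def degree_bounds(n_subjects, group_size=5):
    """Smallest and largest possible summed degree of preference."""
    return n_subjects, (group_size - 1) * n_subjects


# Hypothetical rankings from three subjects (not data from the study).
sample = [
    {"calm": 1, "cheerful": 2, "alert": 3, "dignified": 4, "friendly": 5},
    {"calm": 5, "cheerful": 3, "alert": 2, "dignified": 4, "friendly": 1},
    {"calm": 2, "cheerful": 4, "alert": 1, "dignified": 5, "friendly": 3},
]

print(compare_pair(sample, "calm", "cheerful"))
print(degree_bounds(96))
```

With 96 subjects the bounds come out as 96 and 384, the minimum and maximum stated in the text.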


Table 1 
Method of Analyzing Results from Form 4 

                        Number        Number 
                        Preferring    Preferring    Degree of 
Items Compared          1st           2nd           Preference 

calm–cheerful             30            66            170 
calm–alert                32            64            192 
calm–dignified            74            22            170 
calm–friendly            [24]           72            210 
cheerful–alert           [44]           52            176 
cheerful–dignified       [84]           12            202 
cheerful–friendly        [22]           74            124 
alert–dignified          [86]           10            234 
alert–friendly           [42]           54            198 
dignified–friendly        [8]           88            250 

Bracketed figures are supplied by subtraction from the 96 subjects. 

The analysis discussed here was made for a dual purpose: to eliminate 
poor items (those not answered differently by various persons, or not a 
sufficient distance apart for reliability) and to simplify the test in this 
process of elimination to make it less time consuming in administration 
and easier to score after weights were developed. From the scoring point 
of view, the simplest procedure would have been to have all items in 
paired comparison form. However, the use of three items to be ranked 
in order of preference also gives results which are readily scorable and 
has the further advantage of giving two paired comparisons for each three 
items, as contrasted with two comparisons for each four items placed in 
paired comparison form. The goal at this point in the analysis was 
therefore to select three items from each group of five, selecting items all 
of which were ranked 1, 2, and 3 by different persons, and which were 
psychologically sufficiently apart to assure reliability. 

Results were tabulated in the form given in Table 2 (using the same 
items as previously used for illustrative purposes). Letters refer to items 
as follows: a, Be calm; b, cheerful; c, alert; d, dignified; e, friendly. The 
first figure in each cell gives the degree of preference, and the second gives 
the number of persons preferring the item preferred by the majority. 


Table 2 
Method of Tabulating Data for Item Selection 

          a          b          c          d 

a         – 
b       170/66       – 
c       192/64     176/52       – 
d       170/74     202/84     234/86       – 
e       210/72     124/74     198/54     250/88 





In selecting items for the revised test, a degree of preference as close 
as possible to the theoretical maximum of 384 was desirable. (This 
maximum could be reached, of course, for only one of the ten possible 
comparisons.) No rigid selection standard was set inasmuch as numeri- 
cal interpretation is vague, and the purpose of the degree of preference 
index was merely to increase the reliability of the final items. The ideal 
number of persons preferring each of the two compared items would be 
48 preferences for one item and 48 preferences for the other. An arbi- 
trary standard of no more than 74 persons (75%) preferring either of the 
items was established. 

On the basis of eliminating items preferred by more than 75% of the 
persons, items bd, cd, and de in Table 2 were eliminated. In all of these 
cases so many persons preferred one of the two items that the item did 
not sufficiently discriminate between individuals. Further, use of any 
item answered similarly by a large percentage of subjects might result in 
use of an item the “right” answer of which could be predicted. 

There were ten possible combinations of three items which could be 
selected from each group of five items, namely: abc, abd, abe, acd, ace, 
ade, bcd, bce, bde, and cde. For the same group of items previously used 
for purpose of illustration, six of these possibilities were eliminated 
because more than 75% of the subjects had preferred one of the items. 
These were abd, acd, ade, bcd, bde, and cde. The remaining four 
combinations satisfied the criterion of 75% or less responding in a single 
way. These remaining four combinations were then compared on the basis 
of degree of preference. They were abc: 170, 192, 176; abe: 170, 210, 124; 
ace: 192, 210, 198; bce: 176, 124, 198. Of these four remaining 
possibilities the best was a combination of items ace, inasmuch as the 
degree of preference was greatest for this particular combination. 
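
The selection of a three-item combination can be retraced with a short sketch. It is an illustration under stated assumptions rather than the author's procedure: the pair figures are those of Tables 1 and 2, the cut-off of 74 persons is the arbitrary standard given in the text, and ranking the surviving combinations by the sum of their three degrees of preference is an assumption (the text says only that the degree of preference was "greatest" for ace). The names `PAIRS` and `acceptable_triads` are hypothetical.

```python
from itertools import combinations

# (degree of preference, number preferring the majority item) for each pair,
# taken from Tables 1 and 2. Items: a calm, b cheerful, c alert,
# d dignified, e friendly.
PAIRS = {
    ("a", "b"): (170, 66), ("a", "c"): (192, 64),
    ("a", "d"): (170, 74), ("a", "e"): (210, 72),
    ("b", "c"): (176, 52), ("b", "d"): (202, 84),
    ("b", "e"): (124, 74), ("c", "d"): (234, 86),
    ("c", "e"): (198, 54), ("d", "e"): (250, 88),
}
MAX_MAJORITY = 74  # standard stated in the text: no more than 74 of 96 persons


def acceptable_triads(pairs, max_majority):
    """Return {triad: summed degree of preference} for triads whose three
    pairs all satisfy the majority cut-off."""
    items = sorted({item for pair in pairs for item in pair})
    result = {}
    for triad in combinations(items, 3):
        triad_pairs = list(combinations(triad, 2))
        if all(pairs[p][1] <= max_majority for p in triad_pairs):
            # Assumption: triads are compared on the sum of their degrees.
            result["".join(triad)] = sum(pairs[p][0] for p in triad_pairs)
    return result


triads = acceptable_triads(PAIRS, MAX_MAJORITY)
best = max(triads, key=triads.get)
print(triads, best)
```

Under these assumptions the four surviving combinations are abc, abe, ace, and bce, and ace has the largest summed degree of preference, in agreement with the choice reported above.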

Similar analyses were made for each of the 170 groups of five items 
each. Twelve items in one group which were ranked from 1 to 12 were 
analyzed by means of a 12 row 12 column classification to select as many 
three-item combinations as possible with as little overlapping of items 
as possible. An additional 62 items in paired comparison form were 
analyzed solely from the point of view of number of persons making each 
response, and all pairs were eliminated in which one of the two items was 
preferred by more than 75% of the subjects. 

Analysis showed that in some of the groups more than one selection 
of three items each could well be made from the total of five items. In 
such cases, one of the items had to be repeated in each set of three. 
Such repetition was undesirable (inasmuch as future subjects might 
incorrectly believe that entire groups were repeated rather than a single 
item being compared with different pairs) but was permitted in those 
cases where the degree of preference was unusually large. Other cases 
were found where one or more pairs could be selected, but no adequate 
group of three items was present. In such cases pairs were selected and 
added to the previously used paired comparison items. 


Development of Present Form 


The revised form was administered to 133 persons including 63 
miscellaneous individuals, 40 technical salesmen in the Kimberly-Clark 
Creped Wadding Division, and 30 graduate students working for their 
Ph.D. at an institution limited to graduate work in a specialized phase 
of science.² Fifteen of the 63 miscellaneous persons filled in the blank 
on two occasions, approximately one month apart. 


² Administrative officials of the school concerned prefer that the institution remain 
anonymous. The author gratefully acknowledges permission to publish 
results obtained from their use of the Classification Inventory, and respectfully 
follows their desire to remain anonymous. 

The 40 technical salesmen were rated on sales ability by their Sales 
Manager and Assistant Sales Manager. Each rater made three 
independent ratings on each man, ratings being made one month apart. 
Reliability of the combined ratings was .908. For determination of item 
weights the criterion group was divided into upper and lower halves. 

Several weighting techniques were tried, yielding both weighted and 
unweighted scores: the Guilford (2), Kelley (4), and critical ratio (5, 6) 
techniques. With each of these techniques weights were assigned by 
means of an ABAC. 

The 30 graduate students were rated by faculty men on the basis of 
probable professional success in industry. The number of ratings ob- 
tained on each man ranged from 5 to 20, with a mean of 13.8 ratings per 
man. No reliability for these ratings was obtained inasmuch as each 
rater rated different men and a different number of men. Grade point 
averages were also available for these men. Ratings and grade point 
averages were converted to standard scores and averaged. The 
correlation between ratings and grade point average was .652. A figure of 
.789 was obtained by stepping up this correlation by the Spearman-Brown 
prophecy formula to estimate the reliability of the combined 
criterion. This represents the combined criterion in so far as ratings on 
probable success can be considered the same as grade point average. 
Such similarity appears very improbable. The reliability of the criterion 
should, therefore, be considered as no lower than .789 and is greater than 
this in so far as ratings and grade point averages reflect different factors. 
Weights for the graduate student key were determined by the same pro- 
cedures as were weights for technical salesmen. 

Use of the inventory showed that it was too long for practical indus- 
trial use. Although the time was not excessive (requiring approximately 
40 minutes), subjects complained of monotony resulting from similarity 
of content and arrangement. It was believed that a shorter inventory 
was preferable, and that the decrease in reliability which usually accom- 
panies a reduction in test length would be offset by greater care in filling 
in the inventory. 

In reducing the length of the test, four factors were given considera- 
tion: (1) the per cent of persons who favored one item over the others, 
(2) the amount of inconsistency shown in each item by persons who had 
filled in the inventory twice, (3) the number of scoring weights obtained 
for each group of items on the two criterion groups, and (4) the number 
of times an item was repeated with either identical or similar wording 
or meaning. 

The procedure for eliminating items was comparatively subjective. 
Many items were good from some points of view and poor from others. 
Each factor had to be considered in relation to the others and no rigid 
standards could be utilized. 

The present form contains 45 groups of three items each and 55 paired 
comparison forms. Instructions require the subject to “Read the items 
in each group and decide which you think is Best and mark by encircling 
the ‘B’ appearing on the answer sheet before that item. Then decide 
which you think is Worst and mark by encircling the ‘W’ appearing on 
the answer sheet before that item. The remaining item should then be 
marked ‘N’ to indicate that it is neither best nor worst.” 

As an illustration of the type of item included, following are the first 
four test groups of three items each and the first four test items in paired 
comparison form: 


B N W   People who have little control over their tempers. 
B N W   People who think they are better than other persons. 
B N W   People who crow over winning a game. 


B N W   Acquire tuberculosis. 
B N W   Acquire heart trouble. 
B N W   Lose your eyesight. 


B N W   Be calm. 
B N W   Be alert. 
B N W   Be friendly. 


B N W   Faint in public. 
B N W   Lose your job. 
B N W   Have your friends lose confidence in you because of untrue rumors. 


B   Appear conceited. 
B   Appear fidgety. 


B   Persons who always interrupt when you are talking. 
B   Persons who pick their teeth. 


B   Considered nervous. 
B   Considered stubborn. 


B   Considered pleasant. 
B   Considered punctual. 


Scoring of Test Items 


Numerous weighting techniques are available to test constructors, 
most of the methods involving division of the criterion group into two 
subgroups, determining the percentage of persons within each subgroup 
who made similar responses to each item, and determining on the basis of 
these percentages what weight, if any, should be given the item under 
consideration. The test set-up is such that each item is marked B, N, 
or W (except paired comparison items in which one of the two items is 
marked B). Weighting, therefore, was determined on the basis of 
percentage of “good” persons who encircled B for a given item as compared 
with the percentage of “poor” persons who marked that same item B. 
Similar comparisons were made for N and W replies to the same item. 
A suitably designed chart was used to facilitate these tabulations. 


The first weighting technique used was one introduced by Guilford (2), 
which is proportional to the correlation between the item and criterion and 
inversely proportional to the variance of the item. Guilford’s formula gives 
weights ranging from 0 to 8, with a weight of 4 when there is no significant 
difference between the criterion subgroups. Because of the large number of 
items which did not differentiate the criterion subgroups the formula was used 
without adding 4. This gave a range from —4 to +4 with a weight of 0 when 
there is no real difference between the percentages, thus eliminating 
considerable work when computing test scores. The Guilford weighting technique 
was also modified to give unweighted scores based on plus, minus, or zero 
by considering all weights of +1 or more as plus and all weights of −1 or 
less as minus. 

The second weighting technique used was one which is ascribed to Kelley 
(4). In short, the technique consists of determining the proportion of the upper 
criterion subgroup responding in a specified manner and finding the 
corresponding Z score value from appropriate normal probability tables. The Z 
score value is similarly found for the proportion of the lower criterion sub- 
group responding to the same item in the same manner. The weight is ob- 
tained by subtracting the second of these Z scores from the first. 
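
A minimal sketch of the weight just described, assuming that the "Z score value" for a proportion is the normal deviate below which that proportion of the unit normal curve falls (the reading taken here of the normal probability tables; the source does not spell it out). The function name is hypothetical.

```python
from statistics import NormalDist


def kelley_weight(p_upper: float, p_lower: float) -> float:
    """Difference between the normal deviates of two proportions.

    p_upper, p_lower: proportions of the upper and lower criterion
    subgroups giving the response; each must lie strictly between 0 and 1,
    otherwise NormalDist.inv_cdf raises StatisticsError.
    """
    z = NormalDist().inv_cdf
    return z(p_upper) - z(p_lower)


# Hypothetical example: 80% of the upper half and 50% of the lower half
# mark the item B.
print(round(kelley_weight(0.80, 0.50), 3))
```

Equal proportions give a weight of zero, and reversing the two subgroups reverses the sign of the weight.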

A third weighting technique used was the t ratio (or critical ratio) 
technique, where the difference between proportions of the two criterion subgroups 
is divided by the standard error of the difference. 
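
The ratio can be sketched as below. The source does not state which standard error formula was used; the unpooled form, sqrt(p1 q1 / n1 + p2 q2 / n2), is assumed here for illustration, and the function name and example figures are hypothetical.

```python
from math import sqrt


def critical_ratio(p_upper: float, n_upper: int,
                   p_lower: float, n_lower: int) -> float:
    """Difference between two proportions divided by its standard error.

    Assumption: unpooled standard error of the difference.
    """
    se = sqrt(p_upper * (1 - p_upper) / n_upper
              + p_lower * (1 - p_lower) / n_lower)
    if se == 0:
        raise ValueError("standard error is zero; the ratio is undefined")
    return (p_upper - p_lower) / se


# Hypothetical item: 70% of 20 upper-half men and 40% of 20 lower-half men
# give the response.
print(round(critical_ratio(0.70, 20, 0.40, 20), 2))
```

For this hypothetical item the standard error is .15 and the ratio is 2.0.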

The t or critical ratio technique has the advantage seldom found in other 
weighting techniques of automatically indicating the significance of the 
difference between the two groups. This advantage is particularly important when 
dealing with a small N. In spite of this great advantage, the t ratio is seldom 
used, primarily because of the large amount of statistical work required in 
using the formula. 

When dealing with unweighted scores, a test constructor need not be 
concerned with the exact size of a t or critical ratio if he knows whether or not the 
ratio exceeds that required for a given degree of significance. For use in such 
cases the author has constructed an ABAC which is entered by the two 
proportions and which immediately indicates whether or not the difference is of 
the desired degree of significance. 


All three of these weighting techniques gave essentially the same re- 
sults, all intercorrelations being above .95. Ratios differentiating the 
upper and lower criterion subgroups based on total scores were all above 
9.8 for the group of 40 technical salesmen and above 6.3 for the group 
of 30 graduate students. The t ratio technique was selected for test 
scoring inasmuch as it is no more difficult to obtain when an ABAC is 
used, and carries its own indication of significance of the difference 
between the two criterion subgroups. 

When deciding upon levels of significance to be used for accepting or 
rejecting test items, a test author must avoid two equally undesirable 
extremes. If too high a level is used, many valid items will be rejected 
as having resulted from chance and test reliability will be decreased due 
to the resultant small number of items which are weighted. On the 
other hand, use of too low a level of significance results in weighting 
items which will prove to be invalid in subsequent follow-up studies. 
It has become more or less standard practice to interpret differences at 
the 5% level of significance as significant, and differences at the 1% level 
as very significant. It appears to the author, however, that these 
requirements are too high for selection of individual test items, particularly 
when upper and lower halves of a criterion group are being used. If a 
specified level of significance is considered satisfactory for differentiation 
between upper and lower quarters, a lower standard is equally 
satisfactory when dealing with upper and lower halves. This is true because 
the size of an obtained ratio depends in part on the magnitude of the 
difference between the two groups, and any item having a validity 
coefficient other than .00 will result in greater differences in widely separated 
groups such as upper and lower quarters than in adjacent groups such as 
upper and lower halves. 

For use with the Classification Inventory, levels of significance of .40, 
.10, and .01 were selected for weights of 1, 2, and 3 respectively. The 
chances of there being a true difference in the direction indicated are 
therefore 4:1, 19:1, and 199:1, respectively. Weights were either plus 
or minus depending on the direction of the difference between the two 
criterion subgroups. To some it may appear that scoring weights as- 
signed to differences significant at as low a level as .40 is unusually 
lenient. Comparison of ABACs involving various methods of item 
selection will show that this is not the case when criterion groups consist 
of a small number of persons. The t ratio technique (with a level of 
significance of .40) is particularly more rigorous than other weighting 
methods when one or both of the p’s approximates either .00 or 1.00. 
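
The relation between the three levels of significance and the stated odds can be retraced numerically. In the sketch below the odds are taken as (1 - level/2) to (level/2), i.e. the chance that a difference lies in the indicated direction, which reproduces 4:1, 19:1, and 199:1. Converting a ratio to a weight by referring it to the normal curve is an assumption made for illustration; the author read significance from an ABAC, and its exact construction is not given here. All names are hypothetical.

```python
from statistics import NormalDist

# (level of significance, weight), strictest level first.
LEVELS = ((0.01, 3), (0.10, 2), (0.40, 1))


def odds_of_true_direction(level: float) -> float:
    """Odds that a true difference exists in the direction indicated."""
    tail = level / 2.0
    return (1.0 - tail) / tail


def item_weight(ratio: float) -> int:
    """Signed weight for a critical ratio.

    Assumption: the ratio is referred to the normal curve, two-tailed.
    """
    p_two_tailed = 2.0 * (1.0 - NormalDist().cdf(abs(ratio)))
    for level, weight in LEVELS:
        if p_two_tailed <= level:
            return weight if ratio > 0 else -weight
    return 0


for level, weight in reversed(LEVELS):
    print(weight, level, round(odds_of_true_direction(level), 6))
print(item_weight(2.0), item_weight(-0.9), item_weight(0.5))
```

Under the normal-curve assumption a ratio of 2.0 earns a weight of +2, a ratio of -0.9 a weight of -1, and a ratio of 0.5 no weight.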

Scoring keys have been prepared so that all items having a level of 
significance of .40 or better are on one key (known as Weight 1), items 
having a level of significance of .10 or better on a second key (Weight 2), 
and items having a level of significance of .01 or better on a third key 
(Weight 3). This arrangement permits investigation of the adequacy 
of scoring tests at any one of these three levels of significance as well as 
any combination of these weights. For example, by adding separate 
scores obtained from keys indicating Weight 2 and Weight 3 a score is 
obtained in which items having a level of significance between .10 and 
.01 are weighted 1, and items having a level of significance of .01 or 
better are weighted 2. Using the notation of levels of significance of 
.40, .10, and .01 referring to keys 1, 2, and 3 respectively, the scoring 
combination just referred to is called W2+W3. Other combinations of 
scoring weights are notated similarly. The following seven simple and 
multiple scoring weights have been investigated for each of the two 
scoring keys developed to date: 


W1, W2, W3, W1+W2, W1+W3, W2+W3, and W1+W2+W3. 
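
The way separate key scores combine can be sketched as follows. The data layout is hypothetical: each scored response is identified by an (item, mark) pair carrying a signed weight of 1, 2, or 3, and an item appears on key k when its weight is k or more in absolute value. Adding key scores then gives the graded weighting described above.

```python
def key_score(weights, responses, key):
    """Score one testee on a single key (1, 2, or 3).

    weights:   dict mapping (item, mark) -> signed weight from -3 to +3
               (hypothetical layout; unlisted responses carry no weight).
    responses: list of (item, mark) pairs given by the testee.
    """
    total = 0
    for response in responses:
        w = weights.get(response, 0)
        if abs(w) >= key:
            total += 1 if w > 0 else -1
    return total


def composite(weights, responses, keys):
    """Sum of separate key scores, e.g. keys=(2, 3) for W2+W3."""
    return sum(key_score(weights, responses, k) for k in keys)


# Hypothetical weights and answers, for illustration only.
weights = {("calm", "B"): 3, ("alert", "W"): -2,
           ("friendly", "N"): 1, ("calm", "W"): -1}
answers = [("calm", "B"), ("alert", "W"), ("friendly", "N")]

print(composite(weights, answers, (1,)),
      composite(weights, answers, (2, 3)),
      composite(weights, answers, (1, 2, 3)))
```

In W2+W3 the response weighted 3 contributes two points and the response weighted 2 contributes one, which is the graded weighting the text describes for that combination.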


Intercorrelations between these various weighting methods were 
computed for the following four groups: (1) Criterion group of salesmen, 
(2) Criterion group of graduate students, (3) 143 miscellaneous 
individuals scored on salesman key, and (4) 153 miscellaneous individuals 
scored on the graduate student key. Intercorrelations are given for these 
four groups in Tables 3, 4, 5, and 6 respectively. As might be expected, 


Table 3 

Means, Sigmas and Intercorrelations of Various Scoring Methods for Criterion Group, 
Salesman Key (N = 40) 

              W1      W2      W3    W1+W2   W1+W3   W2+W3   W1+W2+W3 

Mean        17.85    6.75     .78   24.60   18.63    7.53    25.38 
Sigma       19.44   12.79    4.42   31.98   23.58   17.00    36.15 

W1            – 
W2           .968     – 
W3           .923     …       – 
W1+W2        .995     …      .935     – 
W1+W3        .997     …      .948    .996     – 
W2+W3        .969     …      .964    .987     …       – 
W1+W2+W3     .993     …      .950    .999     …       … 




Table 4 

Means, Sigmas and Intercorrelations of Various Scoring Methods for Criterion Group, 
Graduate Student Key 

              W1      W2      W3    W1+W2   W1+W3   W2+W3   W1+W2+W3 

Mean        15.23    4.10    -.97   19.33   14.27    3.13    18.37 
Sigma       24.51   14.50    5.10   38.72   29.06   19.26    43.33 

W1            – 
W2           .968     – 
W3            …       …       – 
W1+W2        .996     …       …       – 
W1+W3        .996     …       …       …       – 
W2+W3        .960     …       …       …       …       – 
W1+W2+W3      …       …       …       …       …       … 


Table 5 

Means, Sigmas and Intercorrelations of Various Scoring Methods for 143 Miscellaneous 
Persons on Salesman Key 

              W1      W2      W3    W1+W2   W1+W3   W2+W3   W1+W2+W3 

Mean        14.59    4.23    -.22   18.82   14.36    4.01    18.59 
Sigma       11.06    6.92    3.24   17.48   13.38    9.57    19.88 

W1            – 
W2           .883     – 
W3           .647    .739     – 
W1+W2        .983    .955    .702     – 
W1+W3        .983    .908    .776     …       – 
W2+W3        .857    .974    .873     …       …       – 
W1+W2+W3     .969    .960    .780    .993     …       … 






Table 6 

Means, Sigmas and Intercorrelations of Various Scoring Methods for 153 Miscellaneous 
Persons on Graduate Student Key 

              W1      W2      W3    W1+W2   W1+W3   W2+W3   W1+W2+W3 

Mean        11.88    3.96    -.71   15.84   11.18    3.25    15.14 
Sigma       11.79    7.61    3.04   18.80   13.74    9.97    20.86 

W1            – 
W2           .871     – 
W3           .063    .695     – 
W1+W2        .980    .951    .635     – 
W1+W3        .983    .901    .705    .982     – 
W2+W3        .984    .976    .836    .920    .903     – 
W1+W2+W3     .965    .958    .718    .994    .987    .951 






all intercorrelations are exceedingly high for the two criterion groups.
They are likewise high for the heterogeneous groups except for intercorrelations
involving Weight 3, which are considerably lower than the
others. Inspection of sigmas shows, however, that variability is very
limited with Weight 3 as a result of the comparatively few test items
which are significant at this level. Such restriction of range should be
expected to result in smaller correlation coefficients.


Validity 


Validity has been determined on two groups, a different scoring key
being used with each group. These groups consisted of 40 technical
salesmen and 30 graduate students. Pearson validity coefficients have
been computed as well as the t ratio obtained by dividing the difference
between the upper half and lower half by the standard error of the
difference. Results (given in Table 7) indicate that so far as validity is
concerned the various scoring methods are approximately equally good.


Table 7 


Validity of Classification Inventory for Two Groups 





Pearson Correlation t Ratio 





Graduate Graduate 
Score Used Salesmen Students Salesmen Students 





W1          [illeg.]   .684   11.4
W2          [illeg.]   .673   10.6
W3          [illeg.]   .670   10.1
W1+W2       196        .685   11.3
W1+W3       [illeg.]   .694   11.5
W2+W3       [illeg.]   .684   11.0
W1+W2+W3    .801       .715   11.4







Split half reliability cannot be obtained in the usual manner because 
many of the test items are unweighted on one or the other key. How- 
ever, a modified split-half reliability was obtained by correlating all the 
odd scored responses with all the even scored responses and stepping up 
by the Spearman-Brown formula to give an estimated total test relia- 
bility. Reliability coefficients are given in Table 8 for the salesman 
criterion group on the sales key, 143 miscellaneous individuals on the 
sales key, student criterion group on the graduate student key, and 153 
miscellaneous persons on the graduate student key. 
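
The modified split-half procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item scores are hypothetical, and the function names `pearson_r`, `spearman_brown`, and `odd_even_reliability` are introduced here for the example and are not taken from the study. The step-up uses the usual Spearman-Brown formula for a test of doubled length.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def odd_even_reliability(item_scores):
    """Correlate odd-numbered with even-numbered scored responses, then step up.

    `item_scores` holds one row per person; each row lists the weights earned
    on the scored responses, in order (hypothetical data layout).
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # responses 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # responses 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical weighted responses for four persons on four scored responses.
SAMPLE_SCORES = [
    [1, 1, 2, 2],
    [0, 1, 1, 0],
    [3, 2, 3, 3],
    [-1, 0, 0, -1],
]
sample_reliability = odd_even_reliability(SAMPLE_SCORES)
```

A half-test correlation of .50, for example, steps up to about .67 for the full test.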


Table 8

Reliability Coefficients

Key       Group        N     W1     W2     W3   W1+W2   W1+W3   W2+W3   W1+W2+W3

Sales     Criterion    40   .972   .905   .900   .940    .925    .942     .949
Sales     Misc.       143   .737   .586   .809   .796    .815    .797     .850
Student   Criterion    30   .954   .934   .803   .975    .974    .940     .981
Student   Misc.       153   .721   .663   .398   .842    .827    .748     .886





Inspection of Table 8 shows that so far as criterion groups are concerned,
reliability is exceedingly high for both the salesman and student
keys; and comparatively little difference exists in reliability among the
various methods of combining weights. It must be remembered, however,
that the use of item weights tends to maximize total score differences
between upper and lower criterion groups, and it cannot be expected
that scores of the criterion group will be comparable to those obtained
from additional groups. A primary difference will be in a restricted
range of scores to be expected from additional groups similar to the
criterion group, and this difference will usually result in lowered reliability
coefficients when comparing those obtained from the criterion group
with those obtained from subsequent groups. The decrease in range of
scores from the criterion group to a subsequent group can be seen by
comparing the sigmas given in Table 3 with those in Table 5, and comparing
sigmas in Table 4 with those in Table 6. The effect of such
restricted ranges on reliability coefficients can be seen in Table 8 where,
without exception, the reliability was higher for the criterion group than
for a subsequent group. In the case of some of the weighting methods,
the drop is exceedingly great.

For brief discussion of criteria used and reliability of criteria see section on
“Development of Present Form.”
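
The size of the drop to be expected from restricted range alone can be roughed out with the standard homogeneity correction, which assumes that the error variance of the scores is the same in both groups. That assumption is introduced here and is not tested in the article, and the function name is hypothetical. The sketch applies the correction to the W1+W2+W3 salesman-key figures: sigma 36.15 (Table 3) against 19.88 (Table 5), with a criterion-group reliability of .949 (Table 8).

```python
def reliability_in_new_range(r_ref, sd_ref, sd_new):
    """Reliability expected in a group with a different score spread.

    Assumes equal error variance in both groups, so that
    sd_ref**2 * (1 - r_ref) == sd_new**2 * (1 - r_new).
    """
    return 1.0 - (sd_ref ** 2 / sd_new ** 2) * (1.0 - r_ref)


# W1+W2+W3 on the salesman key: criterion group (N = 40) to 143 misc. persons.
expected_misc_reliability = reliability_in_new_range(
    r_ref=0.949, sd_ref=36.15, sd_new=19.88
)
```

Under this assumption the expected coefficient for the miscellaneous group comes to roughly .83, close to the .850 reported in Table 8, which suggests that restriction of range may account for most of the drop for this scoring method.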

Validities, as given in Table 7, show little difference between various 
scoring methods, as do also reliabilities of the two criterion groups. If 
consideration of data were limited to these figures the choice of weighting 
method would result in using the simplest method. Due to the drop of 
reliability coefficients in subsequent groups, however, the weighting 
method tentatively selected is W1+W2+W3. Opinions will vary in 
regard to whether or not the increase in reliability obtained by this 
method warrants the increased scoring time. Pending accumulation of 
additional data, such increase in reliability appears worthwhile to the 
author. 


Relationship Between Two Scoring Keys 


Coefficients of correlation have been computed between the salesman
key and graduate student key. For the salesman criterion group the
correlation is .281 ± .10* (N = 40), for the graduate student criterion
group the correlation is .218 ± .12 (N = 30), for 113 miscellaneous persons
(excluding all members of either criterion group) the correlation is
.204 ± .06, and for all persons who have filled in the inventory, including
members of both criterion groups (N = 183), the correlation is .217 ± .05.
These correlations show some degree of overlap, which is only to be expected,
inasmuch as there are undoubtedly numerous personality factors
which tend to make for success or failure in a large number of (if not all)
types of position. Nevertheless, correlations are sufficiently low to support
the hypothesis that there is no universally “good” personality, and
that personality characteristics which are favorable for one type of
position may be unfavorable for another.

* Errors given are probable errors.
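
The probable errors quoted above agree with the conventional formula for the probable error of a Pearson coefficient, PE = .6745 (1 - r²) / √N. The article does not state which formula was used, so that is an assumption here, as is the function name. A short check against the four reported values:

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Conventional probable error of a Pearson r based on n cases."""
    return 0.6745 * (1.0 - r ** 2) / sqrt(n)


# (r, N) pairs as reported for the two criterion groups, the 113
# miscellaneous persons, and all 183 persons combined.
REPORTED = [(0.281, 40), (0.218, 30), (0.204, 113), (0.217, 183)]
probable_errors = [round(probable_error_of_r(r, n), 2) for r, n in REPORTED]
```

Rounded to two places, these reproduce the reported .10, .12, .06, and .05.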


Summary and Conclusion 


The Classification Inventory reported here was devised as a test of 
personality which would be practical for use in the industrial employment 
situation. Items are such that a prediction of the “right” answer is
difficult, thereby minimizing the giving of desirable answers either by 
intent or as a result of rationalizations or defense mechanizations. Most 
of the items are listed in groups of three, and the testee indicates which 
of the three items he considers best, which is worst, and which is inter- 
mediate. Other items are listed in pairs, one of which must be checked 
as being better. Items universally considered favorable or pleasant are 
compared with each other, and unfavorable items are similarly compared. 
This procedure eliminates use of a yes-no dichotomy neither answer of 
which may appear to the testee as being accurate in his case, eliminates 
use of an ambiguous ? and eliminates use of adjectives denoting degrees 
which adjectives permit of variable interpretations. 

The latest revised form of the Classification Inventory contains 245 
items, consisting of 45 groups of three items each and 55 paired com- 
parison forms. An answer sheet is used which permits the test booklets 
to be reused, which saves considerable filing space, and which greatly 
facilitates scoring of tests. 

Scoring weights were assigned test items by means of ABACs according
to the Guilford, Kelley, and critical or t ratio techniques. Weighted
and unweighted total scores correlated exceedingly high with each other
as did also scores obtained from each of the three techniques. The
method finally decided upon was the t ratio technique giving scores
weighted plus or minus 1, 2, or 3 based on levels of significance of .40,
.10, and .01.

The Inventory is intended to be scored and validated on jobs rather 
than traits. Two such studies have been completed. In both cases the 
Inventory shows satisfactory validity and reliability. Validation studies 
have been made on groups comparable to those on whom the Inventory 
is subsequently to be used. 

Data are now being accumulated for three more validation studies: 
(1) women supervisors in industry, (2) hourly paid men production 
workers, and (3) hourly paid women production workers. Other spe- 
cific studies are planned for the future, particularly with respect to men 
supervisors in industry. Obviously, similar studies could be made for 
any other occupational group. 
In certain cases it may be advisable to have keys pertaining to personality
traits rather than specific occupations. Such keys will have
little use for occupational selection, but may prove valuable for industrial
counseling and guidance as well as similar guidance in educational
and clinical situations. Such keys will be developed when time and
circumstances permit.

If future research verifies that already done, the Classification Inven- 
tory will be released for general use. In line with the author’s conviction 
that the Inventory should be validated on specific jobs, release of the test 
will not include scoring keys. Detailed instructions, including ABACs,
for developing scoring keys for specific situations will, however, be 
included. At the present time, however, the Inventory must be con- 
sidered as being in an experimental stage, requiring considerably more 
research before definite conclusions can be drawn other than its appear- 
ance of promise. 


Received November 15, 1943. 
Locating the Troublemaker with the Guilford-Martin
Personnel Inventory 


Howard G. Martin 


University of Southern California 


Among the troublemakers in industry are those workers who are at 
odds with their fellow employees and with their supervisors and whose 
negativistic attitude constitutes a source of friction and irritation in the 
production unit. The working force must function as a smoothly co- 
ordinated team in these days of highly specialized functions. The func- 
tion of each worker is often so dependent upon that of every other worker 
in the group that one or more malcontents who are unable or unwilling 
to fulfill their production roles can sabotage the efforts of the entire unit. 

Another undesirable aspect of the activities of the maladjusted work- 
ers is that certain types are capable of evoking negativistic reactions in 
their fellow employees who are susceptible. Many persons who are 
sufficiently cooperative and agreeable under normal circumstances have 
undesirable tendencies which they have learned to conceal and control 
effectively. But under the goading and stimulation of certain types of 
troublemakers they constitute fertile ground for the development of 
discontent. 

Another situation in which a sorehead is likely to ruin the morale of 
his unit and cause a drop in production is when he is in a supervisory 
position which allows him to indulge his inclinations at the expense of 
those under his supervision. Continual criticism, suspiciousness, and a 
domineering attitude on the part of a supervisor constitute a condition 
which may so frustrate and disturb those under him that workers who, 
under another man, would be satisfactory employees become discon- 
tented and make trouble. 

The psychological interpretation of troublemaking behavior is an 
application of the principles which explain how an individual adapts 
himself to his environment, how he learns and forms habits, and how his 
temperament traits develop. The course of a normal life is a series of 
adjustments in which each individual modifies his behavior in response 
to the combined situation created by his needs and by the opportunities 
of his environment (1). Certain basic human needs cannot always be 
satisfied immediately and directly. These thwarted needs impel the 
individual to find some means of satisfying them. Both desirable and 
undesirable kinds of behavior are developed in an effort to satisfy these
thwarted needs. 

The undesirable activities of the troublemaker are types of adjustive 
behavior which he has learned in order to satisfy his thwarted needs.
Jobs which are repetitious and dull, which lack prestige and social recognition,
and which are hazardous and fatiguing thwart many of the
workers’ needs and may constitute the precipitating causes of troublemaking
behavior. By transferring the malcontent to new jobs in an
effort to find one which is not so frustrating for him and by careful
counseling techniques he may be kept from hindering production. However,
when the labor supply is plentiful the most effective method of
dealing with the problem in a single company may be to determine before 
they are hired which workers will later become troublemakers and hire 
only those who have the highest probability of becoming well-adjusted 
workers. 

Although the thwarting of the individual’s motives results in the 
undesirable adjustive behavior which comes under the heading of trouble- 
making, these frustrating experiences are merely the precipitating causes 
of these disgruntled activities. It is obvious that not all persons who 
are subjected to the same frustrating experience of working in a factory 
exhibit such undesirable behavior. The predisposing causes of malad- 
justment lie in the temperament traits of the troublemaker. An indi- 
vidual’s temperament can be defined in this connection as his tendencies 
to make certain kinds of adjustive responses to frustrating situations. 

If differences in temperament determine whether an individual will 
function effectively or develop into a malcontent in an industrial situa- 
tion in which a considerable degree of frustration is often inevitable, the 
problem of locating the troublemaker before he has an opportunity to 
make trouble is reduced to determining the temperament structure of 
each individual who applies for employment and hiring only those who 
possess the traits which constitute tendencies toward making desirable 
adjustments. The problem is to measure the basic variables of tem- 
perament with sufficient reliability to effectively weed out the undesir- 
ables and at the same time accurately locate the desirables so they can 
be hired. 

Before temperament can be accurately measured the basic variables 
must be isolated and defined. Factor analysis methods are now being 
employed to determine the statistically independent traits of tempera- 
ment. The application of factor methods to the paranoid area of tem- 
perament, which is believed to be of prime importance in the study of 
the industrial troublemaker, has yielded several traits. From the results 
of Guilford’s factor analysis (2), from a factor analysis by Mosier (6), 
and from Johnson’s clinical analyses of temperament (5), the four basic
variables of paranoid behavior appear to be as follows: suspiciousness, 
faultfinding, belligerence, and personal reference. 

Because it was found impractical to obtain a clear-cut separation, 
or even a reasonable degree of separation, between the first two aspects 
with the questionnaire items used, the suspiciousness trait was allowed 
to be submerged in both the faultfinding and subjectivity traits in prac- 
tical measurement. The three traits remaining, then, but named for the 
opposite, more favorable poles, are: 


Co—Cooperativeness—as opposed to faultfinding, or an overcriticalness of people
and things.

O—Objectivity—as opposed to personal reference, or a tendency to take
everything personally and subjectively.

Ag—Agreeableness—as opposed to belligerence, or a domineering attitude and
an overreadiness to fight.


More than 200 questionnaire items were constructed which were believed
to cover the area of behavior constituting these three traits. This
list, stated in question form to be answered by either “yes,” “?,” or “no,”
was administered in California industrial concerns, business offices, and
civil service units to 250 men and 250 women workers, ranging from the
unskilled factory laborer to the office clerical worker. The age range 
was from 20 to 45 and a minimum requirement of sixth grade literacy 
insured adequate reading ability of all subjects. An effort was made to 
secure 500 criterion cases that represented a truly random sample of the 
personnel population for whom the questionnaire was being designed. 

The items which had been shown by the factor analyses and clinical 
studies to have heavy loadings in a trait were included in the preliminary 
scoring key for that trait. Typical items from these keys are listed 
below: 


For trait Co— 


Do you believe that most people shirk their duties whenever they can without 
appearing to do so? 

Do you believe that only people with money can be sure of getting a square 
deal in courts of law? 


Do you feel that many young people get ahead today because they have 
“pull”?


For trait O— 


Do people near you sometimes whisper or look knowingly at one another when 
they think you are not noticing them? 

Are there some things about yourself concerning which you are rather touchy? 
Are you continually comparing yourself with other people? 
For trait Ag—


Have you frequently wished for enough money or power to impress people
who regard you as an inferior?

Have you very much resented having friends or members of your family give
you orders?

Do you believe that most people require someone to tell them what to do?


Four hundred of the questionnaires were scored with the preliminary
scoring keys thus constructed. The highest 100 and lowest 100 cases
for each trait (extreme quarters of the distributions of scores) were used
as criterion groups for the item analyses. Weights for scoring responses
to the items were derived by a method devised by Guilford (3). Each
weight is directly proportional to the phi coefficient of correlation between
the response and the criterion and inversely proportional to the
standard deviation of the response. The scoring keys contain weights
for 62 items for trait Co (Cooperativeness), 48 items for trait O (Objectivity),
and 38 items for trait Ag (Agreeableness). Only four items were
scored for more than one trait in order to keep intercorrelations of scores
as low as possible.
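
The proportionality stated above can be illustrated with a small calculation. This is not Guilford's published procedure, which is not reproduced in the article; it is a minimal sketch of the stated relation only, with the constant of proportionality taken as 1, hypothetical response counts, and function names introduced for the example. The response is treated as a 0/1 variable, so its standard deviation is the square root of pq.

```python
from math import sqrt


def phi_coefficient(upper_yes, upper_no, lower_yes, lower_no):
    """Phi between giving a response and membership in the upper group."""
    a, b, c, d = upper_yes, upper_no, lower_yes, lower_no
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


def response_sd(upper_yes, upper_no, lower_yes, lower_no):
    """Standard deviation of the 0/1 response over both criterion groups."""
    total = upper_yes + upper_no + lower_yes + lower_no
    p = (upper_yes + lower_yes) / total
    return sqrt(p * (1.0 - p))


def relative_weight(upper_yes, upper_no, lower_yes, lower_no):
    """Weight proportional to phi and inversely proportional to the SD."""
    counts = (upper_yes, upper_no, lower_yes, lower_no)
    return phi_coefficient(*counts) / response_sd(*counts)


# Hypothetical item: 70 of the highest 100 cases and 30 of the lowest 100
# cases give the keyed response.
example_phi = phi_coefficient(70, 30, 30, 70)
example_weight = relative_weight(70, 30, 30, 70)
```

For these counts phi is .40 and the response standard deviation is .50, giving a relative weight of .80. A response given by nearly everyone (or almost no one) has a small standard deviation, so the same phi yields a larger weight.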

As a check on the reliability of scoring, the test papers of the remaining
industrial cases were scored with these final scoring keys which, for
the purpose, were divided into random halves of items. Pearsonian
coefficients of correlation were computed between the scores on these
halves of the scoring keys and when corrected by the Spearman-Brown
formula they become .91 for trait Co, .83 for trait O, and .80 for trait Ag.
The reliability of such questionnaires is, in part, a function of the number
of items scored and so the reliabilities correspond in rank order to
the numbers of items in the scoring keys for the traits.

As is usually true of inventories of this type, there were intercorrela- 
tions among the scorings. Previous experience has shown that scorings 
may be correlated even when traits are not, due to the fact that responses 
to items are not pure indicators of traits (4). The intercorrelations, all 
positive, between these three scoring keys were: Co and O, .55; Co and 
Ag, .63; and O and Ag, .64. The reliabilities of the scoring keys were 
regarded sufficiently high and the intercorrelations sufficiently low for 
testing purposes. 

Two of a series of experiments designed to indicate the degree of 
effectiveness of the Personnel Inventory in distinguishing troublemakers 
from satisfactory employees have been completed. The first validity 
experiment was made when the personnel department of an aircraft parts 
manufacturing concern in Southern California administered the Personnel
Inventory to a group of 51 workers composed of two types of
employees. The first type were workers who, in the opinion of personnel
executives and supervisors, had proven to be satisfactory members of
the company’s personnel. The second type were employees who, in
management’s opinion, had consistently shown themselves to be sore- 
heads and troublemakers. The status of the workers (whether an indi- 
vidual was desirable or undesirable from management’s point of view) 
was not revealed to the experimenters at this time. The 51 workers 
knew they were being evaluated but were not told the specific purpose 
of their taking the test. 

Because the experimenters believed that there were approximately
equal numbers of each type of worker in the group, because the scores
for each trait, when graphed, showed a pronounced bimodality and flatness
of distribution with the medians falling in the valley between the
two concentrations of scores, and because there was no previous
experience with what combinations of scores were most valid in such a
situation, the scores were first interpreted by classifying as “undesirable
temperaments” those cases whose scores were below the median on at
least two of the three traits.

This first interpretation was then checked against the company 
ratings of the workers and it was found that the Personnel Inventory 
had placed in the group labelled “undesirable temperaments” 73% of the
workers who had, in management’s opinion, demonstrated on the job 
that they were malcontents and troublemakers. Also in this group 
whose scores indicated them to have “undesirable temperaments’’ were 
34% of the workers whom management had designated as being satis- 
factory employees. 

From these data critical scores were established which would yield
the maximum diagnostic accuracy, and the scores of the 51 cases were reinterpreted
by placing in the “undesirable temperament” category those cases
which were below the critical scores on at least two of the three traits.
In this final interpretation using critical scores instead of the medians
it was found that the Personnel Inventory had classified as “undesirable
temperaments” 82% of the workers who had, in management’s opinion,
demonstrated that they were troublemakers and soreheads. Also in this
group whose scores indicated that they had “undesirable temperaments”
were 38% of the workers whom management had labelled satisfactory.
The real test of this set of critical scores awaits their use with new groups.
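
The two-of-three rule described above, and the two percentages reported for it, can be sketched as follows. The cut-off values, the six cases, and the function names are all hypothetical and serve only to show the bookkeeping; the article does not give the actual critical scores.

```python
def is_flagged(scores, cutoffs, min_low=2):
    """True when a case falls below the cut-off on at least `min_low` traits."""
    low_count = sum(1 for trait, cut in cutoffs.items() if scores[trait] < cut)
    return low_count >= min_low


def flag_rates(cases, cutoffs):
    """Share of rated troublemakers flagged and share of satisfactory workers flagged.

    `cases` is a list of (scores, rated_troublemaker) pairs, where the rating
    is management's opinion.
    """
    n_trouble = sum(1 for _, rated in cases if rated)
    n_ok = len(cases) - n_trouble
    flagged_trouble = sum(1 for s, rated in cases if rated and is_flagged(s, cutoffs))
    flagged_ok = sum(1 for s, rated in cases if not rated and is_flagged(s, cutoffs))
    return flagged_trouble / n_trouble, flagged_ok / n_ok


# Hypothetical cut-offs and cases (scores on Co, O, Ag; management rating).
CUTOFFS = {"Co": 5, "O": 5, "Ag": 5}
CASES = [
    ({"Co": 3, "O": 4, "Ag": 7}, True),
    ({"Co": 2, "O": 6, "Ag": 8}, True),
    ({"Co": 6, "O": 7, "Ag": 8}, False),
    ({"Co": 4, "O": 3, "Ag": 2}, False),
    ({"Co": 7, "O": 4, "Ag": 6}, False),
    ({"Co": 1, "O": 2, "Ag": 9}, True),
]
troublemakers_flagged, satisfactory_flagged = flag_rates(CASES, CUTOFFS)
```

Because the critical scores were chosen to fit these same 51 cases, rates computed this way would be expected to shrink somewhat on a fresh group, which is the point of the closing caution in the paragraph above.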

The second validity experiment was made when the personnel depart- 
ment of a New York textile manufacturing concern administered the 
Personnel Inventory to a group of 43 workers which included 30 who 
were rated by their supervisors as well-adjusted and 13 who were rated 
as malcontents. Eighty-five per cent (11 out of 13) of the workers rated 
as malcontents made a standard score on trait Co (Cooperativeness) of 
4 or less (the lowest 40% of a random sample of 500 workers on which 
norms for the test are based). Thirty-six per cent (11 out of 30) of the
workers rated as well-adjusted fell in this range of Co scores. Supervisors’
ratings and the scores on factor Co agreed on 31 cases out of
the 43 (72%).

Accepting the sample in this validity study as representative of the 
general run of industrial workers, the conclusion might be made that, if 
100 men applied for 60 jobs and the 60 men highest on trait Co were 
hired and the 40 lowest on trait Co turned down, 85% of the men with 
temperaments predisposing to maladjustment would be eliminated.
With normal labor conditions, more than 100 men would probably apply
for the 60 jobs and as only the top 60 men on trait Co need be hired, it 
thus may be possible to eliminate an even larger percentage of the work- 
ers predisposed to maladjustment. 

The figures on validity derived from these experiments are considered 
satisfactory at this stage of the test’s development in view of the prob- 
able fallibility of the criteria (opinions of management) and the lack of 
any previous experience with what critical scores or what combinations 
of scores best distinguish the well-adjusted employees from the mal- 
contents in these situations. It should be emphasized that the opinion 
of management was the sole criterion of whether a man was a trouble- 
maker or not. Whether all the individuals so designated were actually 
of paranoid temperament is dubious as is also the soundness of tempera- 
ment of those designated by management as well-adjusted. 

For any individual to become a troublemaker in industry both pre- 
cipitating and predisposing causes must be operative. Therefore, not 
every person with a predisposing temperament is necessarily engaging in 
undesirable activities. He may be in a situation which fails to furnish 
the necessary frustrations to evoke his latent tendencies. At the same 
time, not all the malcontents in a sweatshop would necessarily have 
temperaments predisposing them to troublemaking. This is because the 
immediate situation can be so frustrating as to evoke troublemaking 
behavior in the soundest temperaments. Thus, the problem of the indus- 
trial troublemaker can be aided by the use of the Guilford-Martin Personnel 
Inventory only in companies which have an enlightened industrial relations 
policy and reasonably adequate working conditions. 

A series of experiments in several industrial concerns is now under 
way which parallels the methods used in the validity studies described 
here. Should the accuracy of the Guilford-Martin Personnel Inventory 
continue as high in new situations as in the two described, the test will 
constitute a valuable aid in the work of the psychological counselor and 
personnel administrator in industry. 


Received November 1, 1943. 
In the December, 1942, issue of the Journal of Applied Psychology
there appeared an article¹ which described the differences between the
scores obtained by a large group of adult males on the Revised Minnesota
Paper Form Board test and the norms supplied with the test manual.
On the basis of scores made by 785 WPA male workers Hanman concludes:
“these norms are probably more nearly representative of unselected
male populations than are those supplied in the test manual.”
He attributes this to the fact that his norms were compiled from 785
cases as compared with 223 cases in the test manual. It is evident,
however, that mere numbers of cases is not an adequate criterion of the
value of normative data. Furthermore, the fact that his cases were
WPA male workers suggests that his “norms” are of value only as a
basis for describing this particular type of sample of adult male unemployed
workers.

It is the purpose of the present article to report additional data for
adult female applicants for factory employment. These data were compiled
from scores made by 975 women tested by the Hawk-Eye Works
of the Eastman Kodak Company, Rochester, New York. These women
were hired during the fall, winter, and spring of 1942-43 for optical and
mechanical work. The jobs on which these women were being placed
ranged from unskilled, highly-repetitive jobs, such as lens wrapping, to
highly skilled precision jobs of final assembly and inspection. The individuals
tested presented a wide variation in educational background
which ranged from seven years of schooling to graduation from college.
Many nationalities and a small percentage of negroes were included in
the group tested. It is believed that the group is a good sampling of
the type of female applicant which sought work in the Western New
York area during this period.

¹ Hanman, B. The performance of adult males on the Minnesota Paper Form
Board Test and the O’Rourke Mechanical Aptitude Test. J. appl. Psychol., 1942, 26,
809-811.

² Supplementary Adult Norms, Revised Minnesota Paper Form Board, Mimeographed
Edition. New York: The Psychological Corporation, March, 1943.

The tests were all administered in the same room and by the same
examiner to groups varying in size from 8 to 45 individuals. The directions
supplied by the publisher were carefully followed, and the standard
twenty minute time limit was employed. In administering the tests,
forms AA and BB were distributed alternately to the testees so as to
reduce any tendency towards cheating. All scoring and computation
of percentiles were done by the senior author and checked by the
junior author. It is believed that these controls reduce to a minimum


Table 1


Scores Made by Adult Female Applicants for Factory Work 
On the Revised Minnesota Form Board Tests, Series AA and BB 
Compared with Published Norms 


16-25 Female Age Group 26-60 Female Age Group 


Applicants for Published Applicants for Published 
Factory Work Norms Factory Work Norms 


63 58 56 49 
51 49 47 45 
47 46 44 
45 44 41 41 
43 41 39 
41 40 38 
40 39 36 


37 37 
36 
35 34 


No. of Cases 
Median 
Mean 
the variable effect which different examiners, different testing conditions,
and different scorers might introduce.

Table 1 gives the raw scores and corresponding percentile ranks of
the female adult applicants tested at the Hawk-Eye Works compared
with the published norms. Of the 975 female cases at Hawk-Eye, 545
cases were in the 16-25 year age group and 430 cases were in the 26-60
year age group.


Summary of Results 
The data in Table 1 may be summarized as follows: 


1. In the 16-25 year age group, the Hawk-Eye applicants are, with
one exception (35th percentile) equal to or higher than the published 
norms for adult females in this age range. The greatest variation occurs 
at the 5th, 10th, and 15th percentiles where the Hawk-Eye applicants 
are appreciably higher than the published norms. 

2. In the 26-60 year age group, the Hawk-Eye applicants are ap- 
proximately equal to the published norms. The maximum difference at 
any of the percentile points is not more than two points, with the excep- 
tion of the 100th percentile. 

3. At no place in the Hawk-Eye distributions are the scores for the
26-60 year age group higher than those for the 16-25 year group. In 
the manual provided with the test, the norms for the 26-60 year age 
group are higher than those of the younger group at the 5th, 10th, and 
15th percentiles. The published norms cross over at the 20th percentile 
so that the 16-25 year age group norms are then consistently higher up 
to the 100th percentile. It is possible, therefore, that the data for the 
Hawk-Eye applicants may be more representative of adult female work- 
ers than the original norm group. For this reason, the data in Table 1 
may be welcomed by industrial and vocational psychologists for use in
dealing with this type of applicant. 

4. It is important to note that the data obtained by Hanman on
785 WPA male workers showed them to score considerably lower than 
the published norms, whereas the present data show that the 975 Hawk- 
Eye female applicants, on the whole, make scores equal to or slightly 
higher than the published norms. 


Received October 15, 1943. 





A Study of the Effect of the Presence of the Examiner 
upon Test Scores in Individual Testing * 


William E. Bingham, Jr. 
Los Angeles, California 


Can a job applicant be “rattled” into failing an employment test by
virtue of the presence of the personnel examiner? It has been well stated
that “A test is valid when it measures what it purports to measure, that
is, when there is an actual empirical correspondence between test scores
and proficiency in the activity chosen as a criterion” (8). By easy
analysis, the presence of the examiner in the individual test situation is
a factor not present in “the activity chosen as a criterion,” especially
where the occupation is a relatively solitary one. And Luria (7) showed 
that even in the usual school examination students are in various degrees 
of emotional instability. Hence, may not the examiner’s presence in 
individual testing be a source of error? 

Books on the administration of tests, such as those by Bingham (1), 
Hull (5), and Link (6), advise the examiner to put the subject at ease, 
obtain rapport, and use practice tests to cushion the shock. But few 
attempts have been made to find out how well the examiner succeeded. 

Ekdahl (3) in a word-association test, administered individually to 
50 subjects, found that a large majority made slower responses in the 
experimenter’s presence and faster ones when completely alone. Pessin 
(9) discovered that subjects memorized nonsense syllables faster and 
with fewer errors when working alone as against working under the 
examiner’s social stimulation. Dashiell (2) in a survey of this problem
commented on the importance of its implications for vocational testing.


The Problem 


The problem is to measure the difference between test scores when 
working alone and when working with the examiner present. Three 
tests were chosen: a steadiness test using the Whipple apparatus, a typing 
speed test, and a speed test in the addition of simple whole numbers. 


Procedure 


It may be assumed that the groups taking the test alone and taking
it in the examiner’s presence are of equal ability because the same
subjects took both tests. The factor of sex could not well be sorted out,
for the numbers were too few. There were 36 men and 24 women all
told, with an average age of 21. The examiner was a man of 29 who had
spent at least 200 hours in the laboratory in actual individual testing.
None of the subjects had ever taken a steadiness test. Since all of them
were college students, they all had elementary and high school experience
in adding simple whole numbers. With respect to typing experience,
22 subjects had studied typing in school, 9 subjects had no school but
some self-education and experience, and 4 persons had no training and
almost no practice. Although 35 subjects took each test, the ratio of
men to women was about 36 to 24 in each test.

* A research project directed by Dr. F. L. Ruch, University of Southern California,
1943. The writer is grateful, also, to Dr. R. R. G. Watt for statistical advice.

For every subject before each of the steadiness, typing and addition 
tests the number of one minute practice trials was continued until the 
last two trials showed little or no improvement over the third from the 
last trial. The majority of the subjects took about 5 or 6 minutes. 
Practice was not carried toward the physiological limit for fear that it 
would leave the subject in a mental and physical condition not best for 
satisfactory performance in the test itself. 
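
The stopping rule for practice described above can be sketched in a few lines of Python. This is only an illustration: the function names and the `tolerance` parameter are hypothetical, since the article does not say how large a gain still counted as "little or no improvement", and the sketch assumes a score on which higher is better (a measure on which lower is better would be negated first).

```python
def practice_complete(scores, tolerance=0.0):
    """True when the last two trials show little or no improvement
    over the third-from-last trial (gain of at most `tolerance`)."""
    if len(scores) < 3:
        return False  # the rule needs at least three trials
    baseline = scores[-3]
    return all(score - baseline <= tolerance for score in scores[-2:])


def trials_needed(scores, tolerance=0.0):
    """Number of one-minute practice trials given before the rule is met,
    or None if the recorded trials never satisfy it."""
    for n in range(3, len(scores) + 1):
        if practice_complete(scores[:n], tolerance):
            return n
    return None


# Hypothetical practice record (words typed per one-minute trial).
record = [100, 110, 118, 119, 118]
print(trials_needed(record, tolerance=2))
```

With a tolerance of 2 this made-up record stops after the fifth trial, which is in line with the 5 or 6 minutes most subjects are reported to have needed.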

To reduce further the effects of any lack of practice one half of the 
35 subjects worked in the examiner’s presence during the first test period, 
and the other half of the subjects worked alone in their first period. By 
this procedure the effects of more practice obtained in the alone period, 
if it always came first, say, would not always be added to the second 
period in which the examiner was present (9, p. 265). 

Attempts to make Forms B for the typing and addition tests equiva- 
lent to Forms A by using another part of the same quoted magazine 
article for the typing and by interchanging the numbers in the additions 
were put to test. From 16 to 19 subjects took the typing and addition 
tests Form A in the presence of the examiner, and another group of 
subjects also in the examiner’s presence took Form B. Then this latter 
group took Form A alone, while the former group took Form B alone. 
The Typing Means refer to the number of words typed and the Addition 
Means to the weighted number of problems added as defined in the next 
section. The small sampling technique (4, p. 51) when N is less than 30 
was used. 

In all of these tests Form A was given first and Form B second. 
Thus, any practice effects would tend to accumulate on Form B scores. 
In the typing tests if any effects of practice occurred, they were balanced 
by Form B being more difficult than Form A because the differences are 
almost insignificant as judged from the low critical ratios. Or, if it be 
assumed that there were no results of practice, then Form A is equivalent 
to Form B. 





Table 1

Practice Effects and the Equivalence of Forms A and B

            Form A   Form B   Form A   Form B   Differ-   Critical
Test        Mean     Mean     S. D.    S. D.    ence      Ratio

Typing
  Presence  152.7 57. 26.9 + 4.6 .o2
  Alone     153.2 152.7 57.8 .03

Addition
  Presence  366.7 443.4 + 76.7 2.0
  Alone     418.3 340.3





In the adding tests since the differences between means are positive 
and negative as well as about equal, practice effects are not at all evident, 
and Form A appears to be equivalent to Form B. The critical ratio of 
the difference when the Presence and the Alone are combined for the 
same 35 subjects taking Form A and then Form B is .18; so these opposite 
differences equalize and may be due to differences in group ability. The 
standard deviations in both sets of tests likewise seem to be in harmony 
with these conclusions. 

A friendly conversation was kept going before the test and during the
rest periods. No attempt was made to distract the subject; the test
situation was quiet with no talking. “The presence of the examiner”
is therefore defined in these tests as the examiner being in the room plus
this carefully repeated condition: The examiner stood about 2½ feet to
the right and 30 degrees behind the subject, occasionally shifted his feet,
and walked on rubber heeled shoes 10 feet away to a window twice in the
longer addition test and once in the shorter typing and steadiness tests.
These acts and positions were taken casually in a way which was relaxing
to the examiner. It is this “presence of the examiner” which was meas-
ured. The examiner stood closer to the subject than he might otherwise
have done, because in the steadiness test the examiner could not afford to
leave the room during the short 1 minute trials. This important pattern
of distance and position was kept the same in the other two tests. The
only signs that the examiner distracted the subject are those inferred
from the introspections and test results herein reported.

Before the “Alone” period a casual excuse was given for leaving the
room. However, in the steadiness test during the “Alone” period the
examiner sat down 12 to 15 feet in front of the subject with his back to
the subject so that the subject knew that he was not being observed.
The rest periods were ½ to 1 minute between trials of the steadiness test
and 1 to 3 minutes between parts of the typing and adding tests. The
second part of each test was begun with the explanation, “In order to
find out how stable your score is, a second test is given. . . .”

In the steadiness test the stylus was held in the next to the smallest 
hole or the smallest during all the 5 one minute trials. The subjects 
typed 5 minutes in each period and added 8 minutes. Speed and accu- 
racy were made the objectives for each subject before he began the 
practice trials. 


Table 2

Test Comparisons: Working Alone and in the Examiner’s Presence

                   Alone    Presence   Alone    Presence
Test          N    Mean     Mean       S. D.    S. D.

Steadiness    35   2.31     3.60       2.35     3.50

Typing
  Errors           11.0     16.0       6.16     8.44
  Work             152      155        47.4     41.5
  Score            141      141.5      47.2     42.0

Addition
  Errors           5.41     6.97       3.68     3.96
  Work             379      404.5      113      113
  Score            357      372        118      113





Results 


In Table 2, in the row headed Steadiness, 2.31 is the mean of 35 subjects
while working alone. And 3.60 is the mean time in the presence of the
examiner. From the corresponding standard deviations and the product- 
moment coefficient of correlation, r, between the alone and presence 
performances the critical ratio is 3.8. Therefore, the chances are 100 
out of 100 that like populations under like conditions would be more 
unsteady in this examiner’s presence. Although the distributions are 
skewed, the critical ratio of the difference between the medians comes 
to 4.3. 
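
As an illustration of how a critical ratio of this kind is usually computed for the same subjects measured twice, the sketch below uses the conventional formula for the standard error of a difference between correlated means. It is an assumption that this is the exact formula the author applied; the value of `r` is not reported in the article, so the figure used in the example is purely hypothetical, and the function name is likewise illustrative.

```python
import math


def critical_ratio(mean_a, mean_b, sd_a, sd_b, n, r):
    """Critical ratio of the difference between two correlated means.

    Assumes the standard error of each mean is SD / sqrt(N) and that the
    standard error of the difference is
    sqrt(se_a**2 + se_b**2 - 2 * r * se_a * se_b).
    """
    se_a = sd_a / math.sqrt(n)
    se_b = sd_b / math.sqrt(n)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2 * r * se_a * se_b)
    return (mean_b - mean_a) / se_diff


# Steadiness row of Table 2 (alone vs. presence); r = 0.8 is a hypothetical
# value chosen only to show the effect of the correlation term.
cr_correlated = critical_ratio(2.31, 3.60, 2.35, 3.50, n=35, r=0.8)
cr_uncorrelated = critical_ratio(2.31, 3.60, 2.35, 3.50, n=35, r=0.0)
print(round(cr_correlated, 2), round(cr_uncorrelated, 2))
```

The comparison shows why the correlation matters in a repeated-measures design: the higher the correlation between the alone and presence performances, the smaller the standard error of the difference and the larger the critical ratio.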

Under Typing the Errors Mean is the mean number of typing errors. 
The Work Mean is the mean number of words typed. And the Score 
for each subject was obtained by subtracting the number of errors from 
the total number of words typed. In the presence of the examiner the 
subjects made more errors than when typing alone to a degree that also 
is statistically reliable. But no reliable differences can be seen in the 
Work and Final Scores. 

Under Addition the Errors Mean is the mean number of problems
added wrong. The Work Mean is the mean of all the subjects’ total
number of problems worked after weighting them. There were 4 types
of addition problems or columns. A weighting for each type was ob-
tained from the time it took all the subjects, on the average, to add 10
problems of each type. The Work for each subject is the number of
columns of each type which he added multiplied by 2, 3, 4, and 6 respec-
tively and totalled. The final Score is the Work minus the Errors
weighted in the same way.
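
The weighting scheme just described can be sketched as follows. The weights 2, 3, 4, and 6 are taken from the text; the function names and the sample counts are hypothetical and serve only to show the arithmetic.

```python
# Weights for the four types of addition columns, as given in the text.
WEIGHTS = (2, 3, 4, 6)


def weighted_total(counts, weights=WEIGHTS):
    """Sum of the per-type counts, each multiplied by its weight."""
    if len(counts) != len(weights):
        raise ValueError("one count is needed for each column type")
    return sum(count * weight for count, weight in zip(counts, weights))


def addition_score(columns_added, columns_wrong):
    """Return (Work, weighted Errors, final Score) for one subject."""
    work = weighted_total(columns_added)
    errors = weighted_total(columns_wrong)
    return work, errors, work - errors


# Hypothetical subject: columns attempted and columns wrong, by type.
work, errors, score = addition_score((10, 8, 6, 5), (1, 0, 1, 0))
print(work, errors, score)
```

For this made-up subject the Work is 98, the weighted Errors 6, and the final Score 92.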

In the addition tests the chances are over 99 out of 100 that a
similar group of subjects would make more errors in this examiner’s
presence. The greater number of errors did not do much to offset
the very reliable increase in the number of problems added, because
the Final Score was also greater in the examiner’s presence
than when adding alone.

Of some interest are the reactions which subjects gave of their feelings
toward the presence of the examiner in contrast to his absence. Sub-
jects gave 17 unfavorable reports, 11 favorable. Examples of unfavor-
able responses are: “I was thinking of what you were thinking of me . . .
It made me nervous . . . I became more jiggly . . . I felt confused
. . . It made me feel self-conscious . . . Gets me angry, irks me.”

Some favorable reactions:

“I had a feeling of rapport with you . . . Felt as though I had to
struggle harder . . . It made me go like mad . . . I felt at home . . .
It didn’t bother me.”

An introspection was judged unfavorable if it indicated a disorgan- 
izing emotion or a distraction of the attention. A response was judged 
favorable if it appeared to be of an organizing nature, such as, “makes 
me work harder,” or if it meant indifference (7 subjects reported, “It
didn’t bother me”). When alone, subjects said they relaxed and felt
calmer. 

Summary and Conclusions 


The purpose of this exploratory study was to measure the effect of 
the examiner’s presence upon test scores in individual testing as a check 
upon the validity of mechanical ability and personnel procedures where 
the occupation is a relatively solitary one. 

Steadiness, typing and addition tests were given to the same subjects 
in the examiner’s presence and then alone. The results showed that in 
the examiner’s presence the subjects made more errors in each test, and 
thus were less efficient, especially in typing, as compared with working 
alone. Although subjects typed no more words and scored no higher as 
a group in his presence, they very definitely added more columns and 
made better addition scores than when adding alone. The reports of 
subjects on how they felt about the presence of the examiner indicate 
that, to a large degree, he was the stimulus for these effects. 

The interpretation of these results must also take into account the
fact that only one examiner and his undefined personality are here in-
volved and that a real applicant testing situation probably stimulates
more effort and emotion in a subject than does the experimental situation.
At the least, these results point to a need for further research, which has
for its objectives the improvement of the job applicant’s welfare and the
predicting power of mechanical ability tests.


Received October 18, 1943. 


Motion and Time Study: A Resumé and Bibliography


Joseph E. Zerga 


Walt Disney Productions, Burbank, California 


Present shortages of skilled, semi-skilled and even unskilled labor have
made it necessary for industry, more than ever before, to use to the
fullest extent its available machinery, equipment and manpower in order 
to maintain the high standards of production forced upon it by the war. 
Consequently, it has become necessary to develop new production meth- 
ods through the utilization of a job and worker analysis technique known 
as motion and time study. Motion and time study might also be known 
as methods improvement and job standardization as the end result is to 
eliminate needless and ineffective effort on the part of the worker, thereby 
improving job performance and increasing production. To obtain this 
end result it is necessary for the motion and time study analyst to make 
a scientific analysis of the material, equipment, machinery and methods 
used in performing the operation or series of operations comprising a job. 

Motion and time study were, at one time, considered separate tech- 
niques, there being very little relationship between the two. It should 
be pointed out, however, that although time study was mainly used for 
establishing wage rates and motion study for methods improvement, 
improved methods of performing a job result in a reduction of the time 
element, and an efficiency improvement affects wage rates. Consequently,
motion and time study are being thought of as a single technique,
concerned not only with an increase in production but also with the
reduction of unnecessary fatigue on the part of the worker. In Barnes’ (79)
definition of motion and time study it is interesting to note that he does
not differentiate between the two, but considers each an integral part of
the other. Accordingly, the purpose of motion and time study is: to
find the most efficient and economical methods of doing a job; to
standardize methods, tools, equipment and materials; to determine
accurately the amount of time required by an average worker to
do the job; and to train the worker in the new and more economical
method.

The general procedure in methods improvement and job standardi-
zation recommended by Chane (153) is as follows: “(1) A preliminary
analysis of the objective to be obtained; (2) A decision as to the best
procedure of study; (3) The breakdown of the operation into elementary
operations or work elements and the collection of data; (4) An analysis
of the work elements recorded, elimination of waste, etc.; (5) Setting
up of improved elements and sequences; (6) Determination of the re-
quired allowances for necessary delays and fatigue; (7) Determination
of the proper wage rate and, in the case of wage incentive application,
the establishment of the standard of performance for wage payment;
and, (8) The formation of a training chart to be used in training old
operators in the new method or in training new operators.”

The time-standards engineer at the Lockheed Aircraft Corporation 
(473), Burbank, California, obtains the following specific information 
regarding a job and job conditions prior to clocking the job elements: 


“1. Work Performed—(1) Type of operation; (2) class of work; (3) pur-
pose of operation; (4) where operation is to be performed; (5) necessity for
operation.

2. Operation Procedure—(1) Size of work crew; (2) description of material 
flow to workplace; (3) master list of elements in sequence for set-up, oper- 
ating, and tear down; (4) short analysis of how operation and crew combine, 
or how operator or operators work during the cycle. 

3. Floor Plan—Speed sketch of workplace and work area to scale on an
8½ x 11 inch sheet.

4. Materials—(1) Type; (2) kind; (3) range of gage; (4) gage and size
raw; (5) gage and size finished; (6) heat-treat; (7) condition of material re-
ceived; (8) weight of raw piece handled; (9) weight of finished piece handled;
(10) defects; (11) shape in which material is used; (12) how material must
be handled; (13) per cent of scrap; (14) scrap pattern (sketch); (15) how
scrap is handled and stored; (16) per cent salvable; (17) how material is
delivered to machine; (18) in what quantities material is delivered.

5. Jigs or Fixtures—(1) Names and description; (2) how used; (3) num- 
ber of operations performed; (4) handling percentage to operation; (5) con- 
dition of tools; (6) how handled; (7) how material is held in place; (8) how 
cleaning is done when piece is worked on or finished; (9) weight. 

6. Specifications—(1) Tolerance of work; (2) inspections required; (3) dan- 
ger of spoilage; (4) causes of spoilage; (5) care required to handle part; (6) care 
required to handle raw material; (7) finish required. 

7. Machine Tools—(1) Name; (2) type; (3) capacity range; (4) tolerance 
possible; (5) operation performed; (6) maintenance required per 8-hour day; 
(7) range of speeds, feeds, etc.; (8) number of men required to operate; (9) 
efficiency of equipment. 

8. Hand Tools—Names of tools. 

9. Workbenches and Storage Facilities—(1) List of workbenches with full 
description of use, size, etc.; (2) points at which materials are stored for oper- 
ation; (3) how raw material is stored at machine or workplace; (4) how raw 
material is stored at main storage; (5) how finished material is stored at point 
of operation. 

10. General Job Conditions—(1) Light; (2) air; (3) housekeeping; (4) gen- 
eral surroundings and influencing factors; (5) aisle space; (6) equipment lay- 
out; (7) fatigue-causing factors; (8) hazards; (9) interruptions; (10) operator’s 
work position; (11) care required to perform operation; (12) condition of tools, 
jigs, fixtures.” 


After the foregoing information regarding the job, machines, equipment,
materials used, job surroundings, etc., has been obtained, the
time-standards engineer clocks each job element in terms of motion analy-
sis, taking care to avoid an overlapping of motion elements which would
in turn result in an overlapping of clock readings.


An interesting example of methods used to train time-study analysts 
is presented by Robbins (405). A chipper, checker, gun welder, fender 
inspector and solderer were selected for intensive training in time-study 
techniques. The course of study lasted eight hours a day for nine months. 
As an occasional test, the instructor would suddenly arise from his chair, 
walk to the classroom door, open it, close it, switch on the wall light, 
switch it off again, bend to tie his shoelace, straighten up, return to his 
chair, and resume his seat. During this period of activity on the part 
of the instructor, the students would break his activity into elements 
and clock, with stop watches calibrated in hundredths of minutes, the 
time consumed in performing each element. This test was repeated at 
varying intervals until the recordings of all the students coincided. 

In addition to training time study men, engineers, and designers in 
motion and time study analysis, the B. F. Goodrich Company (489), 
Akron, Ohio, inaugurated a training program for foremen. To ensure 
effectiveness the program was built around a number of objectives estab- 
lished to train the foremen to: ‘‘(1) Recognize ineffective operations or 
motions; (2) Think in terms of simple devices and gadgets, not intricate 
designs; (3) Look for the utility value of every movement or group of 
motions; (4) Use what he has to the best advantage before advocating 
changes; (5) Become what could be called a ‘baling wire engineer’;
(6) See the fatiguing parts of various operations; (7) Sell himself an 
improvement idea first; (8) Sell the worker the same idea and see that 
it is performed exactly right; (9) Have the viewpoint that every job as 
it is now being performed, is wrong; (10) Feel that the smallest idea is a 
big one; (11) Realize that improvements just don’t happen, that some- 
one must suggest them; and, (12) First look for the good in suggestions 
before finding fault with them or turning them down.” 

The RCA Manufacturing Company (253), Inc., Harrison, N. J., has 
followed up its work simplification program with the preparation of 
instruction sheets designed to reeducate operators in the most efficient 
methods of job performance. In initiating the instruction sheet pro- 
gram, however, it was necessary not only to first train instructors and 
key operators in the new methods, but also to secure the cooperation of 
other operators by having them make recommendations regarding im- 
proved methods. This active cooperation between employees and man- 
agement has resulted in general acceptance of the program. 

A specific example of the savings to be derived from motion analysis
is also provided by the RCA Manufacturing Company (55). As part
of its work simplification program a revolutionary method of packing
radio receiving tubes was perfected. The following figures indicate the
total savings derived from the new method: 300 per cent in critical ship-
ping space; 30 per cent in material requirements; 20 per cent in labor; an
annual saving of $60,000.00 in material and time; and, an annual saving
of 120 tons of packing material.

At A. H. Mogensen’s (48) Work Simplification Conference, Lake 
Placid, N. Y., in the summer of 1940, the necessity of looking for the 
elementary motions in work methods was stressed. By working out
simple projects, hand motions were segregated into five classifications: 
use of the fingers only; use of the wrist and fingers; use of the forearm, 
wrist and fingers; use of the upper arm, forearm, wrist and fingers; and, 
use of the entire body, upper arm, forearm, wrist and fingers. The 
demonstrations emphasized the necessity of eliminating complicated 
motions and using only those requiring a minimum expenditure of time 
and effort. 

A common misconception often found amongst workers is that the 
purpose of a motion and time study program is to increase production 
by finding methods of making the worker work faster. Actually, as has 
often been stated, the purpose of a motion and time study program is to 
determine excessive and wasteful motions and eliminate such motions by 
improving and standardizing motions that are absolutely necessary. It
is necessary not only to familiarize every worker thoroughly with
the purposes of a motion and time study program but also to
educate the foremen. Cooperation from line supervision will
mean that the program is not only more readily accepted but also more
likely to be effective. If feasible, it is advisable to train the foremen in the
fundamentals of motion and time study techniques. This training will 
enable the foremen to assist the analysts in many respects and also instill 
in them the desire to improve operations in their own departments. 

It should constantly be kept in mind, however, that motion and time 
study investigations must always be made from the standpoint of prac- 
ticality. A motion and time study analysis of every job in an industrial 
organization, especially during wartime conditions, would in most in- 
stances be highly impractical inasmuch as the duties involved in many 
jobs are subject to change over short periods of time. Motion and time 
study analyses should first be made of those skilled and semi-skilled jobs 
that are not likely to change over a significant period of time. 

The preceding resumé has been concerned primarily with motion and
time study as a joint labor-management program, and has briefly con-
sidered the following topics: general procedures to be followed in initiating
a program of motion and time study; the types of information to be
obtained regarding the worker’s duties, machines, equipment and mate-
rials used on the job, job surroundings, etc.; training methods and
objectives; the technique of introducing improved methods to workers;
and, the practical aspects of motion and time study investigations.

The following bibliography covers motion and time study literature
which has appeared in various journals and other publications between
the years 1923 and 1942, inclusive.

Received November 5, 1943.


The Life Insurance Sales Research Bureau


Stephen Habbe 
L.I.S.R.B., Hartford, Conn. 


Shortly after joining this organization about a year ago I wrote a 
friend, not a psychologist, mentioning my new employment. His reply 
was prompt: “What in the world does one do in a Life Insurance Sales
Research Bureau—search for sales?” 

His question is a fair one and his answer is close to the truth. For 
twenty-two years since its founding at the Carnegie Institute of Tech- 
nology in January, 1922, the Research Bureau has been interested in 
sales, especially in ways to improve life insurance selling. 

Owning a substantial amount of life insurance in a good company is 
considered today an evidence of ordinary foresight and prudence. Buy- 
ing this insurance, however, is another matter! Dealing with an insur- 
ance agent often is a confusing experience for the average insurance buyer 
and sometimes he is annoyed by it as well. He wants life insurance but 
he is resistive to buying it. 

The American people have embraced the principles of life insurance 
more fully than any other people in history. In the United States today
there are 68 million policyholders owning 140 billion dollars of legal
reserve life insurance. More than 90 billions of additional life insurance
has been purchased from the government since 1940 by members of our
armed forces. The Social Security program, fraternal benefits, and other
insurance plans add further billions of protection for American fam-
ilies. Insurance in America has gone a long way towards establishing
one of the Four Freedoms—Freedom from Want. 

The pioneer days in life insurance are over. Almost all persons accept 
the idea of insurance. They want to be protected against dying too 
soon and against living too long. Furthermore, they regard life insur- 
ance as a sound investment and as a good way to save money and to 
build an estate. 

But if the principles and functions of life insurance are widely ac- 
cepted, some of the practices of life insurance are not. Among the 
practices that are criticized are several in the field of life insurance dis- 
tribution, and this is the field of primary interest to the Life Insurance 
Sales Research Bureau. 

The first problem to which the Bureau staff turned its attention in
1922 was that of the selection of sales personnel. Most persons will
testify that the quality of agents, both men and women, selling life
insurance has improved since that time, and the Bureau has reason to
feel a satisfaction in that improvement. Psychologists will be interested
in learning that the Bureau published the Aptitude Index for selecting
life insurance salesmen in 1938 and that over one-third million of the
Indexes have been used since that date. Many of the 132 insurance
companies which are members of the Bureau have made it a policy to
hire only those candidates who score “A” or “B” on the Index. This
procedure unquestionably is raising the standards of life insurance selling.

A few years ago the Bureau published a series of studies in the field 
of Morale and Agency Management, and this series has had a noticeable 
effect on life insurance practices, especially in the field offices. Recently 
the Bureau developed a Job Satisfaction form to implement the morale 
studies. The form is used by member companies wishing to survey the 
job attitudes of their agents. 

A current interest of the Bureau is in consumer studies. What does 
the public think of the life insurance companies and of the way they 
operate? How much do they know about basic life insurance principles? 
How do they feel about government insurance programs? 

These are only a few of dozens of problems which have been or are
being investigated by Bureau workers. The life insurance business cre-
ated the Bureau to help it find ways to distribute life insurance better.
The Bureau has been committed to a program of research to find these
better ways, and psychologists have played an important part in this
program.

In addition to the research department, the Bureau has a large and 
active service department. Members of this department visit the com- 
panies to discuss research findings and to discover from the companies 
their experiences in subjecting new life insurance ideas to the final test 
of actual field application. Each member company is visited twice 
annually. Other approaches used by the Bureau to disseminate infor- 
mation include: two-week schools for agency managers, printed and 
mimeographed bulletins, a bimonthly magazine, and a large annual 
meeting. 

For the use of its staff and its member companies, the Bureau has 
assembled and catalogued the largest library of life insurance sales mate- 
rials in the world. It currently receives eight of the leading psychological 
journals. The fulltime services of three workers are required to satisfy 
the calls made upon this reference library. 

“Insurance laws” are the laws of selection and normal expectancy so
familiar to psychologists. Insurance is an intangible product and selling
it means selling an idea. The selling process—selling a piece of paper
that calls for an immediate sacrifice for a future good—is charged with
psychological relationships from beginning to end. The Research Bu-
reau believes there is a tremendous opportunity for the application of
psychological findings to the insurance business, particularly to the selling
phase of the business. For this reason, the Bureau has had one or more
psychologists on its staff during most of its history.

A listing of Bureau psychologists, showing dates of service, follows: 
Frederick Hansen, 1922-1923; Marion A. Bills (Miss), 1923-1924; Rich- 
ard Uhrbrock, 1923-1924; Herbert G. Kenagy, 1927-1936; Arthur 
Kornhauser, 1933-1936 (part-time); Rensis Likert, 1935-1939; Albert K. 
Kurtz, 1935 to present (on leave to war project); John M. Willits, 1936- 
1940; and Stephen Habbe, 1943 to present. 


Received November 26, 1943. 








A Study of Relationships to Somatotype * 


Donald W. Fiske 
Lieutenant, H-V(S), U.S.N.R. 


What relationships are there between physique and other aspects of 
the individual? This is an old and recurring question in human thought. 
Scientists from ancient to modern times have tried to relate physique 
with temperament, personality, and susceptibility to mental and physical 
disease without reaching any final conclusions. Probably one reason for 
the appeal of the theories about these relationships is that they would 
afford a pleasantly simple basis for the diagnosis and prediction of human 
behavior if they could be proved conclusively. Today, the question is 
not merely academic. With personnel workers constantly looking for 
convenient measures for predicting vocational adjustment, and with the 
armed services engaged in the largest selection and classification pro- 
grams ever undertaken, every conceivable technique has been brought 
out for critical examination. 

Since the history of theories concerning psychological and physical 
types has been adequately presented by Cabot (4), Wertheimer and 
Hesketh (48), and Anastasi (1), it will not be discussed at length here. 
In addition to these histories, the experimental literature on this topic 
has been surveyed by Polen (35), Paterson (33), Wertham (47), Klineberg, 
Asch, and Block (23), Cabot (4), and Jones (18, 19, 20). References will 
be made here only to studies which are directly relevant to problems 
under consideration. 

Kretschmer, both in his original statement of his thesis (25), and in 
his later summary of experimental work (with Enke (26)), claimed a high 
degree of association between his body types and schizophrenia vs. cir- 
cular psychosis, but other research has cast doubt upon these findings. 
Wertheimer and Hesketh (48) and Garvey (9, 10) have called attention 
to the effect of the age factor upon this apparent relationship. In studies 
of normal adults, we fail to find a clear-cut picture of relationship between 
physique and mental traits. For instance, after her survey of the litera- 


* This research was conducted by the author during his participation in the Adoles- 
cent Study Unit, Phillips Academy, Andover, Massachusetts, which is supported by a 
grant from the Carnegie Corporation. The author wishes to express his gratitude to 
Dr. Robert W. White of the Harvard Psychological Clinic for advice and guidance. 
Opinions expressed herein are the author’s and are not to be construed as official or 
reflecting policies of the Naval establishment. 
ture, Polen (35) reached the conclusion that among normal individuals 
the pyknic physique and the cyclothymic disposition go together, but 
that the corresponding relationship between asthenic physique and schizo- 
thymic temperament has not been definitely established. Studies by 
van der Horst (16) and Kibler (22) tend to support Kretschmer’s theory, 
whereas those of Klineberg, Asch, and Block (23) and of Cabot (4) 
obtained contrary results. 

Several factors account for the disagreements among the findings of 
these workers. One is the difficulty of measuring Kretschmer’s global 
classification of temperaments by objective methods. Most experiments 
in this field, especially those performed in America, have considered it 
advisable to confine themselves to separate, isolated variables of per- 
sonality. For this reason, however, many experiments which have sought 
to test Kretschmer’s particular hypotheses have not actually done so. 
Another factor allied to this difficulty is that, in the search for adequate 
measures of mental traits, different studies have used different measures: 
various paper-and-pencil and other “personality” tests and ratings have 
been frequently employed in this field, so that direct comparisons be- 
tween studies are usually not possible. 

Another factor varied from study to study has been the method of 
classifying the subjects’ physiques. General impressions, single meas- 
urements, and indices based on combinations of measurements have all 
been used. Most of these methods have attempted to place individuals 
along a single continuum or have divided them into only a few types. 

In the problem of classifying human physiques Sheldon (39) has 
proposed a technique which avoids many of the difficulties of the earlier 
methods. Not only does his somatotyping procedure permit a high 
degree of objectivity in the determination of an individual’s body-type, 
but it also provides a comprehensive scheme which adequately represents 
individual differences in physique. In addition, Sheldon (40, p. 400) has 
reported correlations as high as .83 between components of physique 
and corresponding components of temperament. Since these are ex- 
tremely high in comparison with those previously reported in the litera- 
ture, the possibility arises that the technique of somatotyping developed 
by Sheldon might contribute greatly to an objective solution to the 
physique-temperament problem. 

The studies reported in the present paper employed Sheldon’s somato- 
types. By using this improvement over earlier methods for classifying 
physiques, it was hoped that the relationships involved in this problem 
could be more clearly delineated. To avoid the difficulties encountered 
by previous investigators who made subjective estimates, objective meas- 
ures of other variables were used wherever possible. The measurement 
of temperament and personality presented the most serious problem. In 
the main study described below, ratings were obtained but had to be 
discarded because they were unreliable and undependable. Practical 
limitations of time and resources made it impossible to use certain other 
techniques. 

With the aim of making an extensive investigation of relationships 
to somatotypes in boys, a number of different areas were surveyed. 
Intelligence was measured by standard group tests of the multiple choice 
type with time limits, by a vocabulary test, a test of spontaneous thinking, 
and by scholastic average. Motor speed, accuracy, and point pressure 
were studied. Several physiological indices were used: these included 
basal metabolic rate, breathing pattern, electroencephalogram, and blood 
group. The Bernreuter Personality Inventory, estimates of adjustment, 
and responses to inkblots were employed to get at some aspects of 
personality. 

Experimental Procedures 


1. Subjects. In the experiments reported here, the subjects were 
students at Phillips Academy, Andover, Massachusetts. In the group 
studied most intensively, all the boys were above the mean for the general 
population on the Modified Alpha Intelligence Examination. Further- 
more, only 5% came from families with annual incomes of $1500 or less. 
In these two respects this group was highly selected and was therefore 
relatively more homogeneous than the subjects in some previous studies. 

The 133 boys forming the major group of subjects were in the ninth 
grade, with ages ranging from 13 years, no months, to 17 years, 5 months, 
and with a median age of 14 years, 7 months. If two subjects are ex- 
cluded, the range is from 13 years, 6 months, to 16 years, 9 months. 

2. Somatotypes. In his classification of physiques, Sheldon identifies 
three basic components which are present, in varying amounts, in all 
physiques: 


Endomorphy means relative predominance of soft roundness throughout the 
various regions of the body. . . . Mesomorphy means relative predominance 
of muscle, bone, and connective tissue. The mesomorphic physique is nor- 
mally heavy, hard, and rectangular in outline. . . . Ectomorphy means rela- 
tive predominance of linearity and fragility. (39, p. 5) 


An individual’s somatotype (body-type) is designated by three numer- 
als which indicate the relative strength, on a seven-point scale, of Endo- 
morphy, Mesomorphy, and Ectomorphy respectively. The somatotype 
can be determined from a standardized photograph of the individual by 
taking measurements or by the judgment of a trained experimenter. 

Somatotypes were obtained for 176 boys on whom one or more other 
measures were available. Standardized photographs were taken; from 
these Sheldon determined the somatotypes anthroposcopically (i.e., by 
inspection).¹ While not so objective a procedure as that of taking meas- 
urements from the photograph, the present method was the only one 
possible, since the tables necessary for more precise somatotyping have 
not been worked out for this age level. 

3. Somatotype groups. While Sheldon has found more than seventy 
different somatotypes in the general population, some are much more 
common than others. For experimental purposes, he has suggested that 
somatotypes may be grouped according to the pattern of dominance 
among the three components (39, pp. 64-65). The present study fol- 
lowed this general plan of grouping except that some of Sheldon’s groups 
were subdivided according to actual strength of the components. From 
nine to twenty-two somatotype groups were used, depending upon such 
considerations as the number of subjects available for each procedure. 
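
The idea of grouping somatotypes by pattern of dominance can be sketched in a few lines of Python. This is only an illustration: the rule used here (a component counts as dominant when it exceeds each of the other two by at least `margin` points) and the names `Somatotype` and `dominance_group` are assumptions made for the example. They are not Sheldon's published groupings, nor the subdivisions actually used in the present study.

```python
from typing import NamedTuple


class Somatotype(NamedTuple):
    """Three ratings on the seven-point scale, in the order
    Endomorphy, Mesomorphy, Ectomorphy."""
    endo: int
    meso: int
    ecto: int


def dominance_group(s: Somatotype, margin: int = 1) -> str:
    """Return the name of the dominant component, if any.

    Hypothetical rule (an assumption for illustration): a component is
    dominant when it exceeds both other components by at least `margin`.
    """
    ratings = {"endomorphy": s.endo, "mesomorphy": s.meso, "ectomorphy": s.ecto}
    for value in ratings.values():
        if not 1 <= value <= 7:
            raise ValueError("each component must lie on the seven-point scale")
    for name, value in ratings.items():
        others = [v for n, v in ratings.items() if n != name]
        if all(value - other >= margin for other in others):
            return name
    return "no single component dominant"
```

A finer grouping, such as one that also distinguishes the actual strength of the leading component, would only need a second rule applied after this one.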

4. Statistical analysis. The analysis of variance was used to deter- 
mine whether each variable used in these experiments was significantly 
related to somatotype group. For a given variable, the variance be- 
tween somatotype groups was compared to that within them: i.e., within 
each of the somatotype groups did the individuals tend to have relatively 
similar scores, as compared to the differences among the means of the 
groups? 
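
The comparison of between-group and within-group variance described above can be written out as a one-way analysis of variance. The following Python sketch shows the arithmetic only; the function name and the idea of passing one list of scores per somatotype group are assumptions for illustration, and the resulting F must still be referred to a table of F for the stated degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists,
    one list per somatotype group."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")

    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means about the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores about their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

An F near 1 indicates that the groups differ no more than individuals within a group do; an F below 1 corresponds to the cases in which there is more variance within groups than between them.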

Measures of Intelligence 

Many studies have investigated the relationship between physique 
and intelligence. Naccarati (29) and Naccarati and Garrett (30) opened 
the field by reporting statistically significant correlations between Army 
Alpha and height-over-weight ratio. However, the groups studied were 
small. More thorough studies with better controls, such as that of 
Heidbreder (13) and Child and Sheldon (5), failed to find any significant 
relationship. Most of the work in this field has shown a slight positive 
correlation between height and intelligence, while the relationship to 
weight is usually negligible. Age and socio-economic status have not 
been adequately controlled. 

The main group of 133 subjects in the present study was given several 
intelligence tests in an effort to determine whether certain types of in- 
tellectual functioning are related to physique while others are not. These 
included three group tests. One was F. L. Wells’ new modified Alpha. 
This latest revision of the Army Alpha adds four subtests to the Revised 
Alpha, Short Form, to make a total of four verbal and four numerical 
subtests. Thus Verbal and Numerical scores can be compared. The 
other group tests were the American Council on Education Psychological 
Examination and the Wide Range Vocabulary Test of Atwell and Wells. 


¹ The author is grateful to Dr. Sheldon for his assistance in this task. 
Another intelligence test was administered individually. This was 
the “Good For” questions from the Kappa Questions designed by Wells 
(46). These questions can be used to measure intellectual energy, spon- 
taneity, and creative thinking. By requiring the subject to construct 
his own answers, the test makes it possible to differentiate between 
spontaneous, creative intellectual functioning and the multiple choice 
sort which is measured by the usual group test. In giving it, the experi- 
menter simply asked the subject, ‘‘What is cotton good for?” and re- 
corded the answer. He then asked the same question about each of the 
following: salt, leather, sugar, aluminum, rubber, water, oil, education, 
and government. The kind of response and its completeness were left 
entirely up to the subject. Following suggestions by Wells (45), the 
present author developed a scoring scheme by which higher scores were 
awarded to more abstract, more generalized responses and also to those 
which analyzed the material for the primary reason for its importance 
to mankind. 

Table 1 shows the results of the scores from these tests. None showed 
any significant relationship² to somatotype group. Product-moment 
correlations between each component of physique and vocabulary score 
were all less than twice their standard error. 
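
The criterion of comparing a correlation with a multiple of its standard error can be illustrated as follows. The paper does not state which formula for the standard error was used; the sketch assumes the classical large-sample expression (1 - r²)/√N, and the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def standard_error_of_r(r, n):
    """Assumed formula: classical large-sample standard error of r."""
    return (1 - r * r) / math.sqrt(n)


def exceeds_multiple_of_se(r, n, multiple=2.0):
    """True when |r| is larger than `multiple` times its standard error."""
    return abs(r) > multiple * standard_error_of_r(r, n)
```

Under this assumed formula, a correlation of .14 in a group of 133 has a standard error of roughly .085, so it falls short of twice its standard error.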

A supplementary measure dependent in part on intelligence, scholastic 
average, was also unrelated to somatotype group. This finding disagrees 
with the results found by Sheldon (38) and Pillsbury (34), who both 
report a slight relationship between physique and scholastic achievement. 


Personality Measures 


1. Bernreuter Personality Inventory. The group of 133 boys also took 
the Bernreuter Personality Inventory. Scores on the Neurotic Tend- 
ency, Self-Sufficiency, Dominance, and Sociability Scales were tested for 
their relationship to somatotype group (Table 1). No direct relation- 
ships were found. This result is in agreement with the findings of Kline- 
berg, Fjeld, and Foley as reported by Anastasi (1, p. 251). 

While paper-and-pencil tests have certain well-known defects, Super 
(41) has recently summarized the literature on the Bernreuter and has 


² Throughout this paper, the phrase “significant relationship” refers to F values 
(obtained by the analysis of variance) at or above the 5% level of significance: i.e., to 
values which would occur only once in twenty times if there were no demonstrable 
relationship between the variable and somatotype group. For a few variables, there 
was actually significantly more variance within groups than between them. Since the 
proportion of these F values was not significantly more than we might expect to find 
by chance, and since in these cases neither the variable nor the somatotype group can 
be predicted from the other, they have no direct bearing on the basic problem and we 
may disregard them. 
Table 1 

Relationships to Somatotype Group 

Analysis of variance F values 

                                                  22 groups   9 groups
Intelligence Tests
  Alpha: verbal score                             1.54§
  Alpha: numerical score                          1.42§
  ACE Psychological Examination                   1.82†§
  “Good For” Questions                            1.25§
  Vocabulary (Atwell-Wells)                       1.50
  Scholastic Average                              1.12§
Bernreuter Personality Inventory
  B1-N (Neurotic Tendency)                        1.32
  B2-S (Self-Sufficiency)                         1.23§
  B4-D (Dominance)                                2.28§*
  F2-S (Sociability)                              1.42
Detroit Motor Speed and Precision Test
  No. circles marked                              1.04§
  No. errors                                      1.03
  No. sheets penetrated                           1.14§
Basal Metabolic Rate and Respiration
  Mean of two BMR’s                               1.02§
  Difference between two BMR’s                    LaF
  Respirations per minute during metabolism test  1.30
  Mean height of respirations                     1.95*       2.44*

Product-moment correlations 

                              Endomorphy     Mesomorphy     Ectomorphy
                              r      S.E.    r      S.E.    r      S.E.
Vocabulary (Atwell-Wells)     .14    .08     −.12   .09
Mean height of respirations   −.07   .09     .21    .09

* F values so marked are above the 5% level of significance. 
† For this analysis, 21 groups were used. 
§ For these variables, there was more variance within groups than between them. 


found that in general the scales tend to measure what they were designed 
to. It was thought that Ectomorphic groups might be low on Socia- 
bility and the Mesomorphs might be high on Dominance, in view of 
Sheldon’s Temperamental Scale (40, ch. III). These trends were not 
found, however. 

2. Inkblots. A large number of German studies have investigated 
the relationships between physique and reactions to color, form, and 
inkblots. Papers by Munz (28) and Enke (7) indicate that Rorschach’s 
extratensive Erlebnistype tends to be associated with the bulky pyknic 
physique, while the introversive type is probably found with other 








510 Donald W. Fiske 


physiques, especially with the thin asthenic. This difference is explained 
by the findings that pyknics (and perhaps athletics) give more color 
responses of all kinds, while the asthenics give more movement responses: 
the asthenics also seem to be more sensitive to form. These trends are 
in congruity with the characteristics of the temperaments which Kretsch- 
mer claimed to be associated with his physical types (25). 

The Rorschach Test was not used in this study of adolescent boys, 
partly because time limitations made the usual administrative procedure 
out of the question. Moreover, it was hoped that a longer series of 
inkblots would make the test more reliable by providing more responses 
in each of the scoring categories. 

The blots used were 40 small blots on white cards 8 by 10 centimeters 
in size. Of these, 22 were black and white ones taken from a set of 
Gamma Blots developed by F. L. Wells in an unpublished study. The 
remaining 18 were constructed by the author for the present experiment, 
using colored inks with and without black ink. Unlike the Rorschach 
series, many of these cards had asymmetrical designs and many had 
several small blots on the same card. 

The scoring scheme used in this study embraced most of the usual 
Rorschach categories, as described in Beck (2), and in Klopfer and Kelley 
(24), together with some suggested by Wells in an unpublished memo- 
randum on his Gamma Blots. The categories and measures included the 
following list: 


1. Location of responses: Whole responses (W); interpretation of small or 
rarely used details or blots (d); small or rare details mentioned in connection 
with a larger whole (dr); responses to white parts of card (S); failure to inter- 
pret a portion of the card (i). 

2. Determinants of response: Form responses (F); human movement re- 
sponses (M); animal movement responses (FM); movement in objects (m); 
pure color responses (C); color-form responses (CF); form-color responses 
(FC); total color sum (obtained by weighting responses in the three preceding 
categories 1½, 1, and ½ respectively) (C sum);³ vista responses (V); texture 
responses (tex.); large shade responses (Ch); three-dimensional expanses pro- 
jected on two-dimensional surfaces (K). 


³ No attempt was made to exclude the inkblot protocols of subjects who showed 
tendencies toward color-blindness. Examination of the protocols (scored without 
knowledge of the subjects’ names or defects) for the two definitely red-green blind 
subjects showed that they gave responses which were scored as color-determined (FC 
or CF). Subjects classified as probably red-green blind also gave color responses. It 
should be remembered that red and green were not the only colors in the blots. Atten- 
tion should also be called to a study of Rorschach protocols of color-blind subjects by 
Brosin and Fromm (3) which indicates that one is justified in using this test with such 
subjects; they show such phenomena as color shock. On the basis of this evidence, 
there seemed to be no reason for excluding the protocols of subjects with color-blind 
tendencies. 
3. Frequency of responses: Original responses, given by only one subject 
out of the 133 (O); popular responses, given by at least 25% of the subjects 
(P). 

4. Other measures: Total number of responses (R); organization responses 
(Z); animal responses (A); use of qualification or limitation in response (L); 
giving elaborations or particulars (part.); use of affective words (a); failure 
to respond to a card within the 30 seconds allowed (Fail.); number of cards 
refused (Ref.); mean time before giving response; ratio of human movement 
responses to total color sum (M sum/C sum); total color sum minus sum of 
human movement responses; sum of form-color responses minus sum of color- 
form responses. 


The Rorschach variables were scored according to common practice 
(e.g., number of responses, percentage of Form responses, etc.). In addi- 
tion, all the determinants of responses and most of the other categories 
were dealt with in terms of percentage of responses falling in that cate- 
gory; this procedure was necessary since the total number of responses 
per subject varied from 15 to 91. 
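
The two scoring conventions just described, the weighted total color sum and the conversion of category counts to percentages of all responses, can be sketched as follows. The weights are taken as 1½ for pure color, 1 for color-form, and ½ for form-color, which is the conventional Rorschach weighting; the function names and the dictionary layout of a record are assumptions made for the example.

```python
# Weights for the three color categories (conventional Rorschach weighting).
COLOR_WEIGHTS = {"C": 1.5, "CF": 1.0, "FC": 0.5}


def total_color_sum(counts):
    """Weighted sum of pure color (C), color-form (CF), and
    form-color (FC) responses. `counts` maps category symbol to count;
    missing categories are treated as zero."""
    return sum(weight * counts.get(symbol, 0)
               for symbol, weight in COLOR_WEIGHTS.items())


def percentage(count, total_responses):
    """Express a category count as a percentage of all responses (R),
    so that records of very different length can be compared."""
    if total_responses <= 0:
        raise ValueError("total number of responses must be positive")
    return 100.0 * count / total_responses
```

Since the number of responses per subject ranged so widely, a raw count of, say, six Form responses means something quite different in a record of 15 responses than in one of 91; the percentage removes that dependence on record length.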

The split-half (odd-even) reliabilities of the various measures were 
calculated for 50 records selected at random. The results are given in 
Table 2, where the median reliability is shown to be .62. While some of 
the less common types of response had low reliabilities, the more impor- 
tant ones (such as total number of responses, percentage of Form re- 
sponses, and percentage of Movement responses) had reliabilities as 
high as .90. 
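
An odd-even reliability of this kind can be computed as sketched below. The correction for full test length is assumed here to be the Spearman-Brown formula, 2r/(1 + r), which is the usual choice but is not named in the paper. The layout of a record (one score per card, in order of presentation) and the function names are likewise assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(records):
    """`records` holds one list per subject, with one score per card.

    Returns (half-test correlation, corrected full-length reliability).
    """
    odd_totals = [sum(r[0::2]) for r in records]   # cards 1, 3, 5, ...
    even_totals = [sum(r[1::2]) for r in records]  # cards 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)
```

The correction always raises a positive half-test correlation, which is why uncorrected values, such as those reported for CF% and FC%, are not directly comparable with the corrected ones.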

The reliabilities were, in general, comparable with those obtained 
from the reliability studies of the Rorschach technique made by Hertz 
(15), Thornton and Guilford (42), and others. The attempt to obtain 
improved reliabilities by presenting more blots was not successful. This 
failure was probably due in part to the fact that adolescent and pre- 
adolescent subjects give less reliable records, as Kerr’s study (21) would 
indicate. It is also likely that the methods of administration influenced 
the reliabilities. Thornton and Guilford (42) report decreased reliabili- 
ties with the imposition of time limits. In the present experiment, no 
card was exposed for more than 30 seconds. Also, when the subject 
had given one response to a card and did not soon find something else, 
he was encouraged to go on to the next one. One further difference in 
administration was a prohibition against turning the card around. The 
results of this experiment suggest that the greater reliability which may 
be expected from using more cards may be offset by allowing the subject 
less freedom in reacting to the cards. 

In spite of the fact that Rorschach workers insist on considering an 
individual’s total Rorschach record rather than scores on various isolated 
measures, separate categories and measures had to be employed in the 
present quantitative study. It should be noted that four of the variables 
Table 2 

Relationships of Ink-Blot Variables to Somatotype Group 

                                    F value             Correlation with Components
                      Relia-    21        9         Endo.         Meso.         Ecto.
Variable²             bility¹   groups    groups    r      SE     r      SE     r      SE
W%                    .61†      1.16§
d%                    .65†      1.57§
dr%                   .88†      1.60
S%                    .33       1.51      1.57      .07    .09    .01    .09    .03    .09
S sum                           1.37
i sum                 .73†      1.12
F%                    .72†      1.09
M%                    .79†      1.36
M sum                 .59†      1.72*     2.05      −.01   .09    .07    .09    .01    .09
FM%                   .53†      1.12§
FM sum                          1.13§
m%                    .77†      1.00§
Pure Color %          .00       1.44      1.70      −.24†  .08    −.18   .08    .28†   .08
Pure Color sum                  1.27
CF%                   (−.02)    1.60§
CF sum                          2.27§*
FC%                   (−.14)    1.42
FC sum                          1.28
Total C sum           .25       1.33      1.53
V%                    .55†      3.09‡     3.55‡     −.06   .09    −.10   .09    .03    .09
V sum                 .63†      1.82*     2.56*     .01    .09    −.05   .09    .02    .09
tex. %                .39†      1.35§
Ch%                   .21       1.41
Ch sum                          1.04
K%                    .45†      1.06§
O%                    .45†      2.47§*
P%                    .10       1.06§
P sum                 .04
R                     .90†      1.21
Z%                    .79†      1.16
Z sum                           1.05
A%                    .72†      1.29
L%                    .89†      1.51      1.52
part. %               .65†      1.39
a%                    .63†      1.56      1.76
Fail sum              .74†      1.03
Ref sum               .88†      1.41
M Resp. time                    1.10§
M sum/C sum                     1.47      1.20
Total C sum − M sum             1.59      1.43      −.08   .09    −.13   .09    .13    .09
FC − CF                         1.30§

Median Reliability    .62

† Correlation is three times its standard error. 

§ More variance within groups than between groups. 

* F value is above the 5% level of significance. 

‡ F value is above the 1% level of significance. 

¹ Split half (odd-even) reliability, corrected for full length of the test. (No correc- 
tion was made in the case of CF% and FC%.) 

² The meanings of these symbols are given on pp. 510-511. 
used measured the relative strength of certain trends in the individuals’ 
record and thus studied some of the important relationships or balances, 
in conformity with general Rorschach practice. 

All of the measures above were analyzed for relationship to somato- 
type group; the resulting F values are given in Table 2. It will be seen 
that 3 out of 40 are significant (with more variance between groups than 
within them). Percentage of Vista responses was significant at the 1% 
level, while the absolute number of Vista responses gave a value signifi- 
cant at the 5% level. All correlations with individual components were 
negligible. The number of Human Movement responses was significantly 
related to somatotype group although the percentage was not. Once 
again, none of the components correlated appreciably with the variable, 
the largest correlation being barely more than its standard error. Al- 
though neither the percentages nor the absolute number of Pure Color 
responses yielded a significant F value, the percentages correlated —.24 
with Endomorphy, —.18 with Mesomorphy, and +.28 with Ectomorphy, 
the standard error being .08 in each case. However, the use of product- 
moment correlations in this case is questionable, since the scores are not 
at all normally distributed: only 21 subjects (about one in six) gave Pure 
Color responses. Half of this small group had somatotypes with no single 
component dominant, whereas only one-fourth of the group as a whole 
fell into this category. This difference in proportions is statistically 
significant. 
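
A comparison of two proportions of this kind can be sketched as a pooled two-proportion z test. The paper names neither the test used nor the exact counts, so everything numerical in the example is hypothetical: 10 of 21 and 23 of the remaining 112 are merely round figures standing in for “half” and for the balance of “one-fourth of the group as a whole,” and are not the author's data.

```python
import math


def two_proportion_z(hits_1, n_1, hits_2, n_2):
    """Pooled z statistic for the difference between two independent
    proportions hits_1/n_1 and hits_2/n_2."""
    p_1 = hits_1 / n_1
    p_2 = hits_2 / n_2
    pooled = (hits_1 + hits_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / standard_error


# Hypothetical counts, for illustration only (not taken from the study):
# subjects with no single component dominant among those who gave Pure
# Color responses, compared with those who gave none.
z = two_proportion_z(10, 21, 23, 112)
differs_at_5_percent = abs(z) > 1.96  # two-sided 5% criterion
```

The two groups are kept separate (21 and 112) rather than comparing the 21 with all 133, since the z test assumes independent samples.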

None of the other measures gave significant results. This finding is 
not consistent with those reported in German studies (6, 7, 28), as dis- 
cussed above. Furthermore, the traits employed by interpreters of 
Rorschach records are similar to those related to physique in Kretschmer’s 
and Sheldon’s theories. Kretschmer’s asthenics were supposed to be 
unsociable, reserved, sensitive, and excitable (25, p. 124), all of which 
are traits indicated by responses to Rorschach cards—by relative empha- 
sis on form, color, and shade in the interpretations. Sheldon’s Cerebro- 
tonic traits of “sociophobia,” “love of privacy,” and “secretiveness of 
feeling, emotional restraint” (40, Ch. III) have considerable similarity 
to these same Kretschmerian attributes. 

3. Adjustment and Possession of a “Good Personality.” The boys in 
the main group of subjects for these experiments were rated by the school 
physician on general adjustment to the school and to their classmates 
and on the possession of a “good personality.” These ratings were made: 
after a consideration of the opinions of the boy’s classmates, teachers, house- 
masters, and athletic coaches and one or more interviews between him and 
the student. . . . The traits that classified the student as having a poor 


personality were seclusiveness, shyness, excessive nervousness, extreme irrita- 
bility, marked emotional instability, eccentricities, asocial behavior, or general 
inability to get along with his fellows. . . . Boys classified as having good 
personalities were those who got along extremely well with fellow students, 
were making an excellent adjustment to their new situation, and had no 
significant defects of personality or behavior which were apparent to their 
teachers or the school physician (8). 


More than half the boys were rated average. Similar ratings were 
made on the two preceding classes on the basis of their personalities and 
adjustment during their first year (ninth grade) in the school. 

A comparison of somatotypes was made between those in the main 
group of subjects with “good” ratings and those with “poor.” The fre- 
quencies of subjects falling into 13 somatotype groups did not differ 
significantly. The same lack of relationship between “good” personality 
and somatotype was obtained when all three school groups were taken 
together. 

It had been expected that the poorly adjusted boys might have a 
greater proportion of somatotypes with no component dominant. A 
brief preliminary study of 20 boys attending a psychiatric clinic had 
revealed that 40% had no single component dominant in their physiques, 
whereas only 22% of an Andover group of comparable age fell into this 
category; this difference in proportions was found to be statistically 
significant. Since the poorly adjusted boys at this school had relatively 
minor difficulties in comparison with the clinic group, we can conclude 
only that mild maladjustment in this private school group is not related 
to somatotype: the relationship between serious maladjustment and 
somatotype must await a thorough investigation.⁴ 


Motor Speed, Precision, and Point Pressure 


Each subject in the main group was individually given the Motor 
Speed and Precision Test, taken from the Detroit Tests of Learning 
Aptitude. In this test, the subject is allowed four minutes to make 
crosses or “x’s” as rapidly as possible in a series of circles graduated from 
17 to 2 millimeters in diameter. Arrangements were made to measure 
point pressure at the same time, by interleaving blank sheets and carbon 
paper (of smaller size than the blank sheets so that no edges showed) 
beneath the sheet with the printed circles. All subjects used the same 
mechanical pencil and the same lead. Since accuracy and speed were 
given equal emphasis in the instructions, the subject could concentrate 
on marking many circles carelessly and hastily, or on neatly confining his 
crosses to the circles. 


⁴ In an effort to obtain personality measures of a different type, each boy was rated 
by one or two raters on a 31 item scale. A number of the variables were essentially 
the same as traits allegedly related to physique. However, the ratings proved to be 
unreliable since the raters were not trained in rating. Only two variables (not a sig- 
nificant proportion) were related to somatotype group. 
Motor speed was measured by the number of circles filled, precision 
by the number of crosses extending beyond the circles, and point pressure 
by the number of sheets through which the impressions penetrated suffi- 
ciently to leave marks from the carbon paper. The reliability of scores 
on number of sheets penetrated was determined by correlating two judges’ 
estimates; the correlation was .85. 

None of these three measures was significantly related to somatotype 
group (see Table 1). This finding does not confirm other studies in this 
field. Oseretzky’s results (31, 32) suggest that the thin asthenics have 
better fine motor coordination than individuals with other types of 
physique, and that they may also be more precise. Furthermore, the 
subjects with non-asthenic physiques are said to show crude movements, 
according to Gurevitch (12); a lack of motor relaxation is also attributed 
to them by Kretschmer and Enke (26). None of these conclusions was 
supported in this experiment.⁵ 


Some Non-Psychological Measures 


1. Electroencephalograms. Electroencephalography is a relatively 
new field. While there is evidence, from the work of Lemere (27) and 
of Travis and Gottlober (43, 44), for the individuality and consistency of 
EEG records, several major problems remain: the technique for analyzing 
these records, their significance for the individual, and their relationships 
to psychological and physical variables are not yet fully established. 

Studies by Lemere (27), Gottlober (11), and Henry and Knott (14) 
have attempted to relate aspects of personality to the EEG. Papers by 
Saul, Davis, and Davis (36) and by Jasper, Solomon, and Bradley (17) 
suggest that certain broad characteristics of personality are associated 
with EEG patterns. Since some of this work had employed traits also 
thought (by Kretschmer and others) to be related to physique, the possi- 
bility of a relationship between physique and the EEG appeared to be 
worth exploring. 

EEG’s⁶ were obtained from 176 boys of 14 or 15 years who were in 
the class providing the main group of subjects in these experiments or 
the class just ahead of it. Age was controlled because there is evidence 
that EEG patterns vary with age. These records were analyzed into 
spectra according to the Grass method of frequency analysis, so that for 
each individual the relative voltage or energy at the different frequencies 
was indicated. These energy patterns were classified into three basic 


⁵ In another attempt to correlate motor performance with physique, three judges 
sought to predict somatotype from samples of the handwriting of 95 subjects; their 
predictions were completely unsuccessful. 

⁶ The recording and the analysis of the EEG’s were done by Dr. F. A. Gibbs and his 
assistants, Boston City Hospital, Boston, Massachusetts. 
groups: Group A included those records showing a high, simple peak at 
some one frequency; Group B contained those with a weak and complex 
peak; the records in Group C showed no peak. The A and B groups 
were subdivided according to the location of the peak: i.e., whether it 
fell at 9, 10, 11, or 12 waves per second. 

No significant relationship was found between these EEG groups and 
somatotype groups. Another analysis comparing average strength in 
each component for each EEG group also failed to show any relationship. 
We obtained no evidence of any association between cortical activity and 
physique in these superior adolescent boys.7

2. Metabolism Records and Breathing Pattern. Basal Metabolic Rates 
were determined on two successive days for 131 boys from the main 
group of subjects. The mean of these two determinations showed no 
significant relationship to somatotype group. The size of the difference 
between these two determinations was also unrelated. 

From the breathing record obtained in the first test, the mean num- 
ber of inspirations per minute and the mean height of inspiration were 
calculated; while the number of inspirations was not significantly related 
to somatotype group, the height of inspiration was (see Table 1). Prod- 
uct-moment correlations between this variable and the three components 
of physique were not statistically significant; the highest was +.21 
(S.E. = .09) with Mesomorphy. Comparison of the means of the vari- 
ous somatotype groups showed a similar trend, the Mesomorphic groups 
tending to have slightly higher mean heights of inspiration. 

3. Blood Group. Since Schaer (37) has reported a relationship be- 
tween blood group and physique, the Andover subjects were studied to 
determine whether this finding could be repeated with somatotype group- 
ings. Using chi-square, it was found that, for each of the four standard 
blood-type groups, the distribution of somatotypes did not differ signifi- 
cantly from the general distribution for these adolescents: there was no 
evidence of any relationship between physique, as classified into somato- 
type groups, and blood-type. 


Summary 


A group of private school boys (N = 133 to 176) were somatotyped 
according to Sheldon’s procedure for the classification of physiques. 
They were then grouped on the basis of similarity of physique into 9 
to 22 groups. The analysis of variance was used to test whether the 
variables employed in this study were related to somatotype group. 


7 Cf. Gallagher, Gibbs, and Gibbs (8), who report a relationship between normal 
electrical activity in the cortex and “average” personality in a group of subjects which 
largely overlapped with those used in the present experiment. 
Several intelligence tests and also scholastic average were found to
be unrelated to this classification of physiques. Four scales of the 
Bernreuter Personality Inventory showed a similar lack of relationship. 

Out of 40 variables used to score responses to a series of inkblots, 
3 showed a significant relationship to somatotype group: these were the 
percentage and also the absolute number of Vista Responses, and the 
number of Human Movement Responses. These were associated not 
with components of physique but rather with patterns of component 
dominance. Pure Color Responses showed a low correlation with Ecto- 
morphy. 

Ratings on good adjustment and possession of a “good personality”
showed no relationship to somatotype group. Measures of motor speed, 
precision, and point pressure also produced no statistically significant 
findings. Electroencephalograms were analyzed to determine the rela- 
tive energy at the different frequencies; the resulting energy patterns 
were unrelated to somatotype groupings. Although basal metabolic rate, 
variability of metabolic rate, and number of inspirations per minute 
during the determination of basal metabolic rate were unrelated to 
physique, the mean height of inspiration proved to be slightly associated 
with the Mesomorphic component of physique. Blood group and so- 
matotype group were uncorrelated. 

The number of significant findings in this study of adolescent boys is not
greater than chance expectancy. The use of Sheldon’s improved procedure 
for classifying physique yielded the same paucity of significant relationships 
to physique that has been found in earlier studies. 


Received September 17, 1943. 
Interest in and Value of College Courses


A. Q. Sartain and E. Graham Waring 
Southern Methodist University 


There is often much discussion of the value which college students 
attach to the various courses they take and of the interest they have in 
them. Closely related is the question of whether some teachers are 
thought to provide more interesting and valuable courses in the same 
college subjects than do their colleagues. 


Construction of the Scales 


The present study represents an attempt to measure the general 
value and the interest value of certain college subjects. Two attitude 
scales, one on the general value of a course and the other on interest in 
a course, were constructed and administered to more than 500 college 
students. These scales were built by means of the technique of equal- 
appearing intervals. About 90 students in sophomore psychology courses 
were asked to contribute statements as to the value and, later, the interest 
value, of courses they were then taking, considering each course sepa- 
rately without, however, naming the course. In general, each student 
contributed five statements on each topic. The resulting statements 
were then edited, duplications eliminated, grammatical errors corrected, 
and other necessary changes made. The result was a group of more
than 100 statements on each topic, ranging from highly favorable to 
highly unfavorable opinions. These statements were then presented to 
104 other college students, all upperclassmen and graduate students, who 
served as judges of the favorableness or unfavorableness of each state- 
ment, placing them in categories from I (least favorable) to XI (most 
favorable). Fifty-two judges were used for each scale. 

For each item on each scale there were then determined, first, its 
median category of placement, and second, its quartile deviation. The 
final scales consisted of 21 and 22 statements for General Value and 
Interest respectively. These statements were selected so as to be as 
equally spaced as possible (with regard to median category of placement) 
from least favorable to most favorable and so as to keep quartile devia- 
tions as low as possible. The score of one who fills out a scale is the 
median of the category values of the statements accepted. 
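
The two item statistics and the scoring rule just described may be sketched as follows. This is an illustration only: the judges' placements and the five scale values are hypothetical, the function names are our own, and the quartile convention (Python's "inclusive" method) is an assumption, since the interpolation actually used in building the scales is not stated.

```python
from statistics import median, quantiles

def item_statistics(placements):
    """Return (median category, quartile deviation Q) for one statement."""
    # Q is half the distance between the first and third quartiles.
    q1, _, q3 = quantiles(placements, n=4, method="inclusive")
    return median(placements), (q3 - q1) / 2

def scale_score(scale_values, accepted):
    """Score a respondent as the median scale value of the statements accepted."""
    return median(scale_values[i] for i in accepted)

# Hypothetical placements of one statement by nine judges (categories 1 to 11).
placements = [5, 6, 6, 7, 7, 7, 8, 8, 9]
item_median, item_q = item_statistics(placements)

# Hypothetical scale values for a five-statement scale, keyed by statement number.
scale_values = {1: 1.5, 2: 4.0, 3: 6.0, 4: 8.5, 5: 10.5}
score = scale_score(scale_values, accepted=[3, 4, 5])
```

A statement with a low Q is one on which the judges agreed closely, which is why low quartile deviations were preferred in selecting the final statements.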

The final scales were entitled “General Value of ______” and
“Interest in ______,” those using the scales being instructed to fill in
the name of the course in question. Instructions printed on each of the
scales were alike except at two points, and were as follows: 


This is a study to determine how valuable (or interesting) this course is. 
You will not be marked right or wrong as on an examination, since these are 
only statements of opinion. The statements below represent the opinions of 
some people of the general value (or interestingness) of this course, and we want to 
see whether or not you agree with them. 

Put a check mark (✓) in the parenthesis in front of the statement if you
agree with it.

Put an “X” (X) in the parenthesis in front of the statement if you disagree
with it.

Put a question mark (?) in the parenthesis in front of the statement if you
cannot decide whether you agree or disagree.

EVERY STATEMENT SHOULD HAVE A (✓), AN (X) OR A (?)
BEFORE IT, AND NO STATEMENT IS TO BE CHANGED IN ANY
WAY.

Statements comprising each scale are set out below. The median
category of placement and the quartile deviation for each item are also
shown, but were not included in the copy given to the student.


General Value of ______

Median    Q     Statement

  3.0    .83     1. This course is of little value to me since it has little
                    material that I will remember.
         .71     2. This course is of some general value to everyone.
         .62     3. This course will be of great value in understanding people
                    and situations now as well as later.
                 4. This course lacks value because one can never use any of
                    its benefits later on.
                 5. This course has both cultural and practical value.
                 6. This course will probably have a good deal of value.
                 7. This course is valuable now, but the information will
                    probably be discarded in the long run because of changing
                    conditions.
                 8. This course has no long run value at all--it is the most
                    useless course I have ever taken.
                 9. This course is of exceptional value.
                10. This course will be very valuable in training one to deal
                    with everyday life.
                11. This course is of only medium value to me.
                12. This course is of practical value.
                13. This course has some value in certain respects, but
                    otherwise very little.
                14. This course will be of the utmost value, now and all
                    through life.
                15. This course will aid me greatly in later life.
                16. The sole value of this course consists in the fact that one
                    must have it in order to get on to more interesting work.
                17. This course has fairly good long run value.
                18. This course has no connection with my future plans, and
                    therefore not much value.
                19. This course has value only in a few widely separated
                    parts.
  1.5    .75    20. This course has little or no value.
  2.1    .30    21. This course has very little of value to offer.


Interest in ______

Median    Q     Statement

  1.6    .63     1. A very dull course with poor material.
  8.6    .66     2. This course is rather fascinating.
  5.2    .65     3. This course is mediocre.
  2.6    .66     4. This course is too monotonous.
  9.7    .61     5. This course is very interesting.
  2.1    .42     6. I have little or no interest in this course. There is no
                    stimulus, just hard work.
  6.6    .87     7. This course is not always interesting, but sometimes it is
                    very interesting.
  5.7    .67     8. This course is not very interesting and not dull, but it is
                    just in-between.
  3.0    .42     9. This is a dull course, probably because it generalizes too
                    much and never gets down to specific cases.
  7.9    .93    10. This course is mildly interesting at all times, and vitally
                    interesting sometimes.
  9.0    .50    11. This course is interesting because one learns something
                    new each day.
  5.0   1.12    12. This course could be interesting but the lectures are
                    boring.
  4.4   1.21    13. Only now and then is this course interesting. Sometimes
                    it is extremely boring.
  7.6   1.04    14. This course is interesting the majority of the time.
  1.1    .27    15. This is the least interesting course I have ever taken.
  6.2    .68    16. This course is interesting, but not especially so.
  6.8    .85    17. This course is fairly interesting.
 10.0    .54    18. This course proves very interesting in all cases.
 10.5    .57    19. This course is very interesting; in fact, the hour flies by
                    too quickly.
  3.4   1.06    20. I force myself to be interested in this course.
 11.0    .27    21. This of all classes I have ever had is the most interesting
                    and the most appealing.
  3.8    .85    22. This course is below average in interest because it is
                    sometimes hard to understand.


It will be noted that the median of the category values for the Value 
scale is 6.0 and for the Interest scale is 5.95. The median of the Q-values 
is .67 (Value) and .665 (Interest). 
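
The figure of 5.95 for the Interest scale can be checked directly from the 22 median placements listed with that scale. The short sketch below does so; the values are transcribed in item order, and the thirteenth is read as 4.4 on the assumption that a decimal point was lost in printing. The variable names are ours.

```python
from statistics import median

# Median category placements for the 22 Interest statements, in item order.
# The 13th value is read as 4.4 (assumed; the decimal point is missing in the copy).
interest_medians = [1.6, 8.6, 5.2, 2.6, 9.7, 2.1, 6.6, 5.7, 3.0, 7.9, 9.0,
                    5.0, 4.4, 7.6, 1.1, 6.2, 6.8, 10.0, 10.5, 3.4, 11.0, 3.8]

# With an even number of items the median is the mean of the two middle values.
overall = median(interest_medians)
print(round(overall, 2))
```

The two middle values are 5.7 and 6.2, so the scale is centered very near the neutral category (VI) of the eleven-category series.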


History of the Problem 


The technique of equal-appearing intervals used in this study was 
devised by Thurstone 1 and has been clearly described by Bird.2 A study


* We wish to express our appreciation to Miss Catherine Alexander, who did most 
of the work on the Interest scale and who helped in the administration of the scales, 
and to Dr. W. H. Lichte, who assisted in planning the study. 

1 Thurstone, L. L., “The measurement of social attitudes,” J. Abnormal and Social 
Psychology, 1931-32, 26, 249-269. 

2 Bird, C., Social psychology. New York: D. Appleton-Century Co., pp. 142-172,
1940. 
similar to this one has been reported by Remmers and Silance,3 who
constructed an attitude scale toward ‘‘High School Subject X,” using the 
method of equal-appearing intervals. No results are reported for specific 
subjects, however, and the suggestion is made that as many as four sub- 
jects might be judged on the same sheet. In the present study each 
subject was judged only in a regular class period, and only one subject 
appeared on a sheet. Furthermore the present study, instead of attempt- 
ing to measure the attitude toward the subject in general, represents an 
attempt to measure attitudes toward two specific characteristics of the 
subject, interest and value. 


Administration of the Scale 


Partly in an attempt to check the reliability and the validity of the 
scales and partly to study attitudes toward the courses, the scales were 
administered to 504 college students in various courses. As has been 
said, all students were tested in their regular classrooms, but in every 
case the instructor was required to leave the room while the scales were 
being filled out. Students were instructed not to write their names on 
the blanks, and were assured that they would remain anonymous. They 
were also asked to encircle the mark they received in the first half of the 
course if they took it during the preceding semester. All scales were 
filled out in the week preceding final examinations for the second semes- 
ter. Groups which participated in the study were 31 students in a 
sophomore Government course, 119 students in a freshman English 
course, 120 students in sophomore and advanced Psychology courses, and 
234 students in a freshman orientation course in the Social Sciences. 
Incidentally, since only Social Science was represented by all students 
enrolled, and since freshman and advanced courses are compared, the 
rank of the departments may have relatively little significance. 


Results 


Table 1 shows the scores for the total group and also for each depart- 
mental group on each scale. It should be remembered that the scores 
could vary from approximately 1 (least valuable or least interesting) to 
11 (most valuable or most interesting). The critical ratio (positive or 
negative according to whether the departmental mean exceeds or is ex- 
ceeded by the mean of all the students) shown in each case compares the 
departmental groups to the total group, and indicates that Government 
is significantly above the mean, that Psychology probably is, at least in 
Value, and that Social Science is definitely lower. Furthermore, a
comparison of Social Science with the other courses shows that it is
significantly lower in each case.


3 Remmers, H. H., and Silance, E. B., Generalized attitude scales. J. Social
Psychol., 1934, 5, 298-312.


Table 1

Scores of All Students and of Each Departmental Group on Each Test

                          Interest Scale                 Value Scale
Subject     Number    Mean    SD    SD_M     CR      Mean    SD    SD_M     CR

Total         504     6.39   1.72   .076     --      6.91   1.70   .076     --
Gov.           31     7.08   1.46   .261    4.91     7.57   1.05   .188    5.52
Psy.          120     6.67   1.94   .178    2.41     7.22   1.81   .166    2.76
Eng.          119     6.60   1.45   .133    2.13     6.97   1.54   .141    0.60
Soc. Sci.     234     6.04   1.66   .109   -3.79     6.63   1.74   .113   -2.96


Another problem to be dealt with concerns differences between in- 
structors as to both Value and Interest. Tables 2, 3, and 4 give this 
information for the various departments (only one instructor of Govern- 
ment being represented). The critical ratio again compares the mean 
for each instructor with the total in that department, and it will be seen 
at once that there are significant differences within each table. After 
inspection of these tables one readily concludes that student opinion of 
interest in and value of a course seems to depend as much on the
individual teacher as on the content and designation of the course.
Incidentally, what is perhaps the best evidence for the validity of the
scales is to be found in Tables 1-4, where the differences that proved
statistically significant would most likely, with one or two outstanding
exceptions, have been anticipated by one familiar with the courses and
instructors.


Table 2

Scores of All Students of English and of Students of Each Instructor of English

                          Interest Scale                 Value Scale
Instructor  Number    Mean    SD    SD_M     CR      Mean    SD    SD_M     CR

Total         119     6.60   1.45   .133     --      6.97   1.54   .141     --
A              20     7.48   1.07   .239    4.91     7.60   1.01   .225    3.54
B              54     6.94   1.15   .156    2.32     7.48   1.05   .142    3.62
C              45     5.95   1.59   .237   -3.67     6.14   1.79   .267   -4.26


Table 3

Scores of All Students of Psychology and of Students of Each Instructor of Psychology

                          Interest Scale                 Value Scale
Instructor  Number    Mean    SD    SD_M     CR      Mean    SD    SD_M     CR

Total         120     6.67   1.94   .178     --      7.22   1.81   .166     --
A              39     7.86   1.41   .229    5.90     8.04   0.62   .100    6.37
B              38     6.76   1.75   .288    0.42     7.41   1.69   .278    0.89
C              43     5.50   1.89   .292   -5.12     6.30   2.13   .329   -3.91


Table 4

Scores of All Students of Social Science and of Students of Each Instructor

                          Interest Scale                 Value Scale
Instructor  Number    Mean    SD    SD_M     CR      Mean    SD    SD_M     CR

Total         234     6.04   1.66   .109     --      6.63   1.74   .113     --
               25     6.64   1.68   .342    3.09     7.06   1.68   .342
               52     6.41   1.62   .227    2.36     6.97   1.43   .200
               39     6.14   1.41   .229    0.62     7.36   1.28   .208
               68     5.99   1.82   .223   -0.37     6.22   1.88   .230   -2.56
               26     5.73   1.50   .299   -1.73     6.21   1.72   .344   -2.14
               24     4.96   1.08   .220   -7.96     5.90   1.75   .365   -3.63


As was previously mentioned, students were asked to indicate the 
mark they received in the course the preceding semester. Of the 234 
students in Social Science, 175 reported marks (the others presumably 
not having been in the course the previous semester). The coefficient of 
correlation between Interest scores and marks was .112 and that between
Value scores and marks was .090. It may be that the students did not 
truthfully report their marks; or it may be that there is an absence of 
any close relationship between marks and the student’s estimate of 
interest in or value of a course. Which of the conclusions is correct can 
hardly be decided on the basis of this study. 

Another problem available for study was the relation between Interest
and Value scores.4 As Table 5 indicates, the correlation here was
relatively high, being .702 for the total group. That judgments of interest
in and value of college courses are decidedly correlated is certainly not
surprising. The fact that the coefficient is no larger, however, would
indicate that two different characteristics are being measured and might
throw some doubt on whether it is wise to attempt to measure general
attitudes toward school subjects.


Table 5

Correlation between Interest and Value Scores

Department        Number      r

Total               504      .702
English             119      .653
Government           31
Psychology          120      .753
Social Science      234      .703


4 Each student was requested to put a number on the first scale filled out and to put
the same number on the other one. This made correlation between Interest and Value
possible.

Finally, an attempt was made to check the reliability of the scales. 
The method employed was the correlation of scores on the first half of 
each scale against those on the last half. The group used was the 234 
students of Social Science, and the coefficients obtained were .776
(Interest) and .736 (Value). Corrected by the Spearman-Brown formula,
these become .874 and .848 respectively. If the scales were extended to
twice their present lengths, the coefficients would become .933 and .918 
respectively. In view of the fact that the score on each half of each scale 
was determined by finding the median value of the statements accepted, 
and since frequently the score depended upon the value of a single state- 
ment, the coefficients of reliability seem to be rather high. 
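
The Spearman-Brown correction used above may be sketched as follows. The function name and the lengthening factor k are our own notation; the general form of the formula, k r / (1 + (k - 1) r), is the standard one, with k = 2 giving the usual split-half correction.

```python
def spearman_brown(r, k=2):
    """Predicted reliability of a scale lengthened k times, given reliability r."""
    return k * r / (1 + (k - 1) * r)

# Half-scale correlations reported for the 234 Social Science students.
half_interest, half_value = 0.776, 0.736

# k = 2 estimates the reliability of each full scale from its two halves.
full_interest = spearman_brown(half_interest)
full_value = spearman_brown(half_value)

# k = 4 relative to a half corresponds to a scale of twice the present length.
double_interest = spearman_brown(half_interest, k=4)
double_value = spearman_brown(half_value, k=4)
```

Applied to the half-scale coefficients with k = 4, the formula gives values that round to .933 and .918, in agreement with the figures quoted for scales of twice the present length.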


Summary 


Two attitude scales, one for interest in and another for value of college 
subjects, were built by the method of equal-appearing intervals. These 
scales were then administered to 504 students of English, Government, 
Psychology, and Social Science. On the basis of the results obtained the 
following conclusions seem to be justified: 

1. The reliability of the scales is moderate but satisfactory. 

2. Because the scales showed significant differences between both 
subjects and instructors and because most of these differences seem 
reasonable to one who knows the prevailing student opinion, it is con- 
cluded that the scales probably have some validity. This, however, is a 
point that needs further investigation. 

3. There are, in the opinion of college students, differences in the 
interest value and in the general value of college courses, although it is 
not presumed that the order found in this study is necessarily correct, 
since the sampling was small and freshman, sophomore, and advanced 
courses have not been separated in some groups. 

4. Differences between instructors seem to be even greater than 
differences between departmental subjects. 

5. The relationship between interest in college courses and marks
reported in them is negligible or almost so. The same statement may 
be made concerning the student’s estimate of the value of the course and 
marks. 

6. Interest in courses and value of courses, as these are judged by 
students, are fairly closely related. 


Received September 30, 1943. 
Discussion of Dorcus’ Study of the Humm-Wadsworth
Temperament Scale * 


Doncaster G. Humm 


Los Angeles, California 


Professor Dorcus’ report of his study of the Humm-Wadsworth Tem- 
perament Scale and the Guilford-Martin Personnel Inventory in an 
industrial situation leads me to offer some comments which seem per- 
tinent and which I feel should come to the attention of readers of his 
report. 

In the first place, I feel that his study does not deal with the validity 
of the two instruments he is considering but rather with their applica- 
bility to an industrial situation. Validity means the capability of an 
instrument to measure what it purports to measure. The Humm- 
Wadsworth Temperament Scale indicates the temperamental pattern of 
the subject. Consideration of this temperamental pattern in detail re- 
veals the relative state of the subject’s mental health. An important 
purpose served by the evaluation of an employee’s mental health is to 
afford an estimate of his ability to withstand strain. Success at work 
depends upon the balance between the ability to withstand strain and 
the amount of strain present. As a consequence, the study of the appli- 
cability of the Temperament Scale to success on the job should include 
consideration of the suitability of the placement of the subjects from the 
point of view of the factors which may cause strain, such as intelligence, 
skill, physical condition, interest, and even competence of supervision 
and maladjustments away from the job. 

There are two ways in which one may measure a factor in a complex 
situation. The first is by measuring all of the pertinent factors of the 
situation and then partialling out the effect of the factors one does not 
wish to have considered. In a study of the effectiveness of temperament 
in industry, this could be accomplished by measuring the effect of intelli- 
gence, skill, physical condition, interests, etc., in accomplishing a good 
adjustment. Then, the effect of one of these measures could be con- 
sidered by partialling out the effect of the others. 
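
The partialling procedure described above can be illustrated with the first-order partial correlation, which removes the effect of a single third factor. The sketch below is an illustration only: the three correlations are hypothetical and are not taken from any of the studies under discussion, and the function and variable names are ours.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical values: x = temperament rating, y = job adjustment, z = intelligence.
r_adjusted = partial_correlation(r_xy=0.50, r_xz=0.40, r_yz=0.30)
print(round(r_adjusted, 3))
```

With several factors to be partialled out at once (skill, physical condition, interests, and so on), the same step would be applied successively or a multiple-regression treatment used instead.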


* Dorcus, R. J., A brief study of the Humm-Wadsworth Temperament Scale and the 
Guilford-Martin Personnel Inventory in an industrial situation, J. appl. Psychol., 1944, 
28, 302-307. 
A second way in which one may measure a factor in a complex situation
is by controlling all factors except the one under consideration. It
follows that if one were to evaluate temperament on the job such an
evaluation could be considered if one were to take into account the
reactions of only those workers whose intelligence, skill, interests, and
physical fitness were well adapted to the job and, in addition, to consider
those workers only who were working under recognizably competent
supervisors. While the selection of such a group of workers would
undoubtedly be a difficult task, it does not seem feasible adequately to
consider one element of employee effectiveness without keeping these
other factors constant.

It has been our observation that individuals of poor temperament very 
well placed often do not manifest behavior difficulties, while individuals 
of good temperament badly placed often do. It has frequently occurred 
in our experience that a Temperament Scale result which had been con- 
sidered inaccurate for a considerable period has been found to be accurate 
after long-time observation of the subject. 

The absence of any consideration of factors which contribute to strain 
limits the significance of the presence or absence of behavior problems in 
Professor Dorcus’ subjects. 

In the second place, we differ with Professor Dorcus’ statement: 
“The scorer who possesses the scoring keys and Manual of Interpretation 
of profiles can interpret the Humm-Wadsworth Temperament Scale.” 

There have been several studies of the Humm-Wadsworth Tempera- 
ment Scale made by individuals who have had merely the manuals and 
the scoring keys to direct them. In almost every one of these, the tech- 
nician has fallen into one of several possible pitfalls. It would seem that 
if the Scale is of sufficient importance to warrant research upon it or 
with it, the time required to secure acquaintance with the special tech- 
niques and the precautions of administering, scoring, and interpreting it 
is time well spent. It is for this reason that we have offered an oppor- 
tunity for securing these techniques without charge to members of the 
psychological profession who may be interested. 

Unfortunately, Dr. Dorcus’ study contains an example of one of these 
pitfalls. This is evidenced by his statement, “They (the test subjects)
were told that the company was interested in developing future hiring 
procedures and that the results of the tests would not jeopardize their 
positions in any way.” Such reassurances violate the test conditions 
necessary for using the Temperament Scale, since the Scale was stand- 
ardized on persons who were deliberately placed on the defensive. This 
was done because employees and applicants usually feel defensive when 
confronted with tests. As a consequence, the truthfulness or lack of 
truthfulness in answering the questions of the Temperament Scale is not
as pertinent as the agreement of the responses of the test subjects with 
the responses of individuals of known temperamental constitution. 

In Dr. Dorcus’ study, the effect of his reassurances is indicated by the 
fact that out of the 48 cases only 22, or 46 per cent, responded to the 
Scale with acceptable frankness, as indicated by the No-count; 19, or 
39.5 per cent, responded with doubtful frankness; while 7 subjects, or 
14.5 per cent, responded with unacceptable frankness. While frankness 
of response has been found to vary somewhat with the test conditions 
(from 70 per cent to 80 per cent acceptable), such extreme variations as 
these have not been observed except when the test conditions have been 
violated. 

In this connection, attention should also be given to the amount of 
agreement between the interpretations made by our staff and by Dr. 
Dorcus. It should be observed, first of all, that our own interpretations
were made with the assumption that test conditions were adequately 
met and that the fact that they were not met casts doubt on our inter- 
pretations. However, assuming that they were valid, it should be noted 
that our interpretations were made on a seven-point scale, viz: “Very 
Good,” “Good,” “Fair,” “Doubtful,” “Doubtful Minus,” “Poor,” and
“Very Poor,” with qualifying remarks regarding suitability of placement. 
As a consequence, the reconciliation of our findings with those of Dr. 
Dorcus represents a forcing of all doubtful cases into the unsatisfactory 
group and the ignoring of our qualifying comments. Furthermore, the 
agreement between the interpretations of Dr. Dorcus and those of our 
staff was, in the first instance, in 40 out of 48, or 83 per cent of the cases. 
This should be the agreement considered, since a psychologist making 
interpretations from the manuals and key alone would not have an 
opportunity of revising his interpretations in conference, as was done in 
this instance. In contrast to this, psychologists who have attended our 
seminars have had no difficulty in reaching an agreement of approxi- 
mately 94 per cent with the norms of a test of ability to interpret the 
Scales based on the responses of some fifty-odd technicians. 

It should be pointed out that Professor Dorcus is conservative in the 
conclusions he has made and recommends that the findings he has re- 
ported should be accepted as tentative. 

I have the highest regard for Professor Dorcus’ ability and regret the 
necessity of taking issue with him on these points. 


Received September 11, 1944. 





News and Notes 


Dr. Willard C. Olson, Secretary of the American Psychological Asso- 
ciation, has submitted the following news note: 

“Members at the Annual Meeting of the American Psychological 
Association voted unanimously to adopt the new set of By-Laws devel- 
oped on the basis of recommendations of the Inter-society Constitutional 
Convention. The action was preceded by a mail ballot on the opinion 
of Members and Associates and by unanimous enabling legislation at the
business meeting of the American Association for Applied Psychology. 
The present By-Laws of the APA were amended for a transitional year 
designed to make the reorganization effective at the Annual Meeting 
in 1945. 

“The election of the following officers and representatives was an- 
nounced at the Annual Meeting of the American Psychological Asso- 
ciation: President: Professor Edwin R. Guthrie of the University of 
Washington; Council of Directors: Professor Carl Rogers of Ohio State 
University and Professor Dael Wolfle of the University of Chicago;
Nominees to the National Research Council: Professor Otto Klineberg
of Columbia University and Professors Robert R. Sears and Kenneth W. 
Spence of the State University of Iowa; Representative to the Social 
Science Research Council: Professor Robert R. Sears of the State Uni- 
versity of Iowa.” 





The following news note has been received from Dr. Alice I. Bryan, 
Executive Secretary: 

“The American Association for Applied Psychology held its eighth 
annual meeting at the Hotel Statler in Cleveland, Ohio, on September 12, 
1944. Dr. Carl R. Rogers was elected president for the coming year. 
Other newly elected officers include Frank P. Bakes, Secretary of the 
Clinical Section; C. L. Shartle and Rensis Likert, Chairman and Secre- 
tary respectively of the Industrial Section; C. M. Louttit, Chairman, 
and William A. Hunt, Secretary, of the Military Section; Bertha M. 
Luckey, Chairman of the Board of Editors; and Lloyd N. Yepsen, Chair- 
man of the Board of Affiliates. 

“The Association voted unanimously to approve the proposed By- 
Laws for a reorganization of the American Psychological Association in 
which the five Sections of the A.A.A.P. will become charter Divisions. 
Members present at the business meeting authorized the incoming Board 
of Governors to take the necessary steps leading to legal dissolution of 
the association after the new unified organization begins functioning in 
September, 1945. 

“The presidential address of Dr. A. T. Poffenberger, entitled Psy- 
chology: Academic and Professional, was read in his absence by the Ex- 
ecutive Secretary at a joint meeting with the American Psychological 
Association on September 11. Other joint sessions were also held with 
the A.P.A. and Section I of the A.A.A.S.” 


Pi Lambda Theta, National Association of Women in Education, 
announces two awards of $400 each, to be granted for significant research 
studies on professional problems of women. Studies to be judged must 
be submitted by July 1, 1945, to the Chairman of the Committee on 
Studies and Awards, May Seagoe, U.C.L.A., Los Angeles, California. 
One award was made last year to a group working under the direction 
of Dr. Virginia Lee Block, Director of Guidance of the Seattle Public 
Schools, for a research study entitled, “Women of the Pacific Northwest.” 


The Reading Clinic Staff of the School of Education, The Pennsyl- 
vania State College, State College, Pennsylvania, is sponsoring a Seminar 
on Reading Disabilities, January 29 to February 2, 1945, and a Conference 
on Reading Instructions, June 26 to June 29, 1945. Inquiries 
should be directed to the Director of the Reading Clinic, Dr. Emmett A. 
Betts. 


Dr. Frederick C. Thorne, Editor, announces the publication of a 
new quarterly journal to be called Journal of Clinical Psychology begin- 
ning in January, 1945. Each volume will comprise 400 to 500 pages. 
The annual subscription will be $4.00. Editorial offices will be Brandon 
State School, Brandon, Vermont. 





Book Reviews 


E. K. Strong, Jr. Vocational interests of men and women. California: 
Stanford University Press, 1943. Pp. xxix + 746. $6.50. 


For over two decades Dr. Strong has persevered in studying occupational 
interests defined as “the sum total of many interests that bear in 
any way upon an occupational career.” His point of view has been 
largely empirical: there are sets of responses of liking and disliking which 
can be shown to differentiate members of specific occupational groups 
from workers in general. Possibly because he has made this field so 
thoroughly his own, possibly because the paths to its understanding are 
so tortuous and beset with the need for simultaneous grasp of complex 
interrelations, it is likely that psychologists are only vaguely aware of the 
full significance of the problem he has set himself and the contribution 
he has made to its solution. Now, after six years in preparation, Voca- 
tional Interests of Men and Women tells the story of that research. It is 
not easy reading, but it is essential reading for those who must understand 
the individual’s motivating forces, of which interest is a significant one. 

The book includes twenty-seven chapters grouped into seven parts. 
Part one, as a general introduction, discusses the nature of interests, 
their relation to guidance, and the scales that Strong has developed for 
their measurement. The last chapter in this part presents extensive 
data on similarity of responses between groups, rather than differences. 
Part two summarizes the correlational, factorial, and central tendency- 
variability data by which specific occupations can be differentiated from 
workers in general, and by which families of occupation with underlying 
similarities of interest can be derived. Part three discusses some of the 
correlates and possible determinants of occupational interests. These 
include: occupational level; sex differences in interests; interest maturity 
and age changes in interests; and personality and ability measures as 
related to occupational interests. 

Parts four, five, and six concern the uses of interest measurement in 
practical situations of guidance, selection, and differentiation of superior- 
inferior members of occupational or educational groups. Part six con- 
tains significant data on the effect of differing men-in-general groups as 
points of reference in establishing interest keys for occupations hitherto 
not well differentiated. 

Part seven contains miscellaneous information on statistical formulae, 
weighting schemes, size and composition of criterion groups, stability of 
responses, and interest test scores of various samples of subjects. 
While this organization seems straightforward and logical, it has occasionally 
led to some confusion and duplication in specific topics. For 
example, the topic of classifying occupations into families appears in 
Chapters Three, Eight, Ten, Fourteen and Twenty-two. Parallel to 
this, it is disconcerting to find early references to tables and discussions 
occurring much later in the book, as in the case of the text and many 
footnotes in the chapters comprising Part one. The author, by virtue 
of the amount and scope of his empiric data, is in the embarrassing 
position of having planted so many trees that he may not recognize the 
size and shape of his forest. The task of reporting the sheer number of 
researches, with the ramifications of the various studies, precludes the 
broad generalizations and summary hypotheses one might wish to see. 
This criticism, however, should not be allowed to obscure the fact that 
in no other source can the research worker or clinician find so much that 
is provocative, essential, and significant in the field of interest measure- 
ment. The task of the reviewer and reader is merely made more difficult 
by the close attention required to encompass all that the author has 
discussed. 

Brief comments on the highlights of the book seem the most effective 
method of review. In the first chapter, the discussions of motivation vs. 
efficiency, interests as related to attitudes or personality factors, and 
success vs. satisfaction, with the indeterminate relation between interests 
and success, are basic statements of the behaviour under study. Chapter 
Six, describing similarities of interests, rather than differences, is an 
important point of departure for all data presented later. The extent 
of similarity among and between groups is often overlooked as a basic 
factor in the amount of differentiation that can be elicited by the meas- 
urement device. The two aspects of validation—group differentiation 
and follow-up of individuals subsequently assigned to given groups—are 
separated for discussion in Chapter Seven, as they must be if the group 
differences are to be considered primarily as the result of specific occu- 
pational experience or of significant developmental trends in the indi- 
vidual. Chapters Seven through Nine deal with the first aspect of 
validity—group differentiation. 

Chapters Ten through Fourteen explore possible causal and correla- 
tive factors for the demonstrated group differences. It is within this 
context that the evidence of the importance of the men-in-general group 
(as presented in Chapters Twenty-one and Twenty-two) should be ex- 
panded. It is here also that theories regarding the origin of interests 
may be seen as generalizations or hypotheses. 

Chapter Fifteen begins the consideration of the second aspect of 
validity—the individual’s assignment to a family of occupations on the 
basis of interest measurement. The discussion deals with several aspects 
of permanence of measured interests. This is followed by data on two 
follow-up studies of college groups—one covering ten years and the other 
nine years—that tend to support the validating propositions set forth. 

Prediction of occupational success by interest measurement is the 
topic of Chapter Nineteen; the possibility that the criterion can be recast 
into measures of satisfaction and continuation rather than competitive 
productivity, is discussed. This possibility points to new research areas 
of analysis of job satisfaction and survival. Prediction of educational 
success is discussed in the same general terms in Chapter Twenty. 

Chapters Twenty-one and Twenty-two contain significant and new 
empiric data on differentiation in terms of the characteristics of the 
men-in-general group. (Pages 702-719 of Chapter 27 should be included 
in this chapter since the data they contain are integral parts of the 
discussion.) These chapters give answers to earlier questions regarding 
the extent of interest differentiation that is possible among lower-level 
occupational groups, and therefore open up increased ranges of usefulness 
of interest measurement in high school and industry. 

These comments do not do full justice to the amount of material 
Dr. Strong has assembled, nor to the dispassionate and detached manner 
in which he presents it. One could almost wish he had overgeneralized 
his interpretations or been guilty of special pleading, so that the reviewer 
could denounce and defend with a fine show of critical skill. But the 
book almost defies criticism first because it is a genuine and long-needed 
contribution to an important area of research and second because it deals 
almost entirely with an overwhelming mass of empirically derived data 
that speaks for itself over the entire range of problems in interest meas- 
urement. If the research that will follow after this volume is not sig- 
nificant and productive, it will be only because research workers have 
failed to study and understand the yeoman service Dr. Strong has 
performed for applied psychology. 


Lt. (jg) John G. Darley, USNR. 
Bureau of Medicine and Surgery 
Department of the Navy 
Washington, D. C. 


Cantril, Hadley. Gauging public opinion. Princeton University Press, 
1944. Pp. xiv + 315. $3.75. 

Public opinion research has developed practically from “scratch” 
within the last decade. The present book is timely in that it critically 
and empirically surveys this rapidly growing field. 

The book is divided into five sections. The first of these deals with 
the actual questions employed in a survey—their wording, their inconsistencies, 
and the possibility of measuring the intensity of opinion. On 
this last point we note the comparative value of different methods of 
estimating intensity such as self-rating or graphic scales. In this section 
as in all the others, the author reports special investigations made with 
reference to the questions raised or analyzes existing data with reference 
to the problem. 

Section Two discusses personal interviewing problems—here again 
from an empirical standpoint. Most of these studies seem sound, but 
it does appear rather silly to have an interviewer carry with him a box 
with a conspicuous padlock and a label “Secret Ballot.” The reviewer, 
at least, would be suspicious that when the interviewer got around the 
corner, he would unlock the box. A study was made of results of trained 
and untrained interviewers with their data evaluated by a small number 
of experts. The differences were rather small, but it is not entirely clear 
just how extensive the actual training was. It consisted mainly of a 
representative who went into the field and certainly was not a “course” 
on interviewing. There still seems to be a place for intensive training of 
interviewers, perhaps under academic auspices. The reviewer knows of 
at least one agency which sent its interviewers to school at a university 
for a few weeks. Fairly high reliabilities are reported where the same 
interviewer does the job twice or two interviewers work the same client. 
There are disquieting empirical indications of interviewer bias. Sound 
suggestions are made to select interviewers equally representing the 
biases inasmuch as they can’t be eliminated and also to do such things 
as having outsiders rather than local people working the small towns. 

Part Three deals with sampling. It discusses the precision (ade- 
quacy) and the accuracy (representativeness) and in some detail the 
stratified method of selecting the sample. Studies are reported in which 
actual samples were compared with census figures and the worst bias is 
found for education, that is, the tendency to get the more articulate 
respondents. This shows that it is pretty difficult to do a good job if 
you leave the final selection to the interviewer. Some investigation of 
small samples shows that they are pretty good for a quick, superficial 
sounding of opinion, but the authors are commendably conservative 
to the effect that there are no substitutes for a large adequate sample. 

Part Four gets at the determinants of opinion. Here we are getting 
beyond the mere survey of opinion and trying to find what is back of it, 
which naturally is of interest to the social psychologist. This involves 
breaking down the data on various variables, and it develops that we 
should plan the questions in advance with reference to this subsequent 
breakdown rather than doing the breakdown ex post facto. Procedures 
are developed for getting at the comparative importance of different 
factors in determining opinion. This involves the derivation of an arbitrary 
“index of importance.” It would help if the author explained in 
a bit more detail the logic back of this particular index. It is probable 
that after the statisticians get at it, we shall have a series of such indices. 
The author demonstrates in many instances that groups with different 
backgrounds of information on a particular topic have wide differences 
of opinion about it. Trends over a period of time are presented graph- 
ically, and the curves are more convincing than the frequently published 
lists of percentages. At the end of this section the author derives some 
“laws”—seventeen of them; for example, public opinion is sensitive to 
important events and it is basically determined by self-interest. This all 
seems sound enough and trends in the curves are cited to support each 
of these laws. 

The foregoing constitutes the major discussion of the book, but there 
is a fifth section which applies the techniques to a specific problem, 
namely, the measurement of civilian morale. This is analyzed into 
eleven items (later revised to sixteen) such as awareness of the objective 
and confidence in the leaders. Questions are formulated touching on 
each of these and data on a large number of interviews analyzed. The 
intercorrelations between these components averaged .21 and ranged 
from 0 to .60. Agreement with the objective and determination to 
achieve the objective had the highest correlation with the other items. 
These various components were validated by including an item as to the 
respondent’s participation in the war effort and using that as a criterion. 
This is a rather clever approach. Civilian morale is finally brought down 
to three dimensions, namely, determination to achieve the objective, 
confidence in the leaders, and satisfaction with traditional values. This 
is an interesting chapter for the social psychologist. 

There are several appendices. One gives more technical details re- 
garding the study of civilian morale; another deals with methods for 
correcting for interviewer bias; there are technical notes regarding sam- 
pling and breakdown, nomographs for computing critical ratios, some 
maps used in sampling, and a bibliography. 

The book is not all by the one author; sometimes a chapter is the 
work of one or two, and these are mentioned in footnotes. However, 
with this miscellaneous set of authors we do not have as much discon- 
tinuity as is usual under those circumstances. This is partly due to the 
fact that the approach is so largely empirical throughout. There is a 
commendable critical point of view and an experimental attitude. The 
authors of the various chapters differ somewhat in their caution in draw- 
ing conclusions from a limited study with reference to the larger problem. 
On the whole, it is commendably conservative; there is a pretty careful 
check of most of the variables involved in any particular investigation. 
One gets the impression that we have here a real science that is develop- 
ing and that a lot of research is involved already, and much more remains 
to be done. The larger portion of the book deals with mere securing of 
accurate information about opinions and a smaller portion with attempts 
to get behind the causes for this opinion. This is about the proportion 
in which we may need to work for a little while into the future. As our 
techniques become more thorough and fool-proof we can spend a greater 
part of our time in attempting to analyze what is back of public opinion, 
and then, of course, take the next step and attempt to influence it and 
see to what extent our efforts have been successful. 

It is perhaps fortunate from the standpoint of the development of 
this new scientific field that it came along about the time so many things 
were happening in the world that might change opinion. In peace times 
differences between groups and changes in opinion would be smaller and 
careful analysis more difficult. It occurs to the reviewer that from now 
on historians are going to have a new job, namely, keeping track of public 
opinion as well as of the course of events. 

The book is definitely a contribution to this new field. It will be a 
“must” for persons who are going to work systematically in this field. 
It could be a textbook in courses for training public opinion interviewers, 
and it will be of interest to the social psychologist who, perforce, is interested 
in how the public feels about things. 


Harold E. Burtt 
Ohio State University 


Doherty, William B., and Runes, Dagobert D., Editors. Rehabilitation 
of the war injured. New York: Philosophical Library, 1943. Pp. 684. 
$10.00. 


This book is a collection of 53 papers plus a discussion symposium 
by various authors on problems in the rehabilitation of the war injured. 
The contributions are grouped under the following headings: neurology 
and psychiatry, reconstructive and plastic surgery, orthopedics, physio- 
therapy, occupational therapy and vocational guidance, legal aspects of 
rehabilitation, and two papers on neurologic and vascular lesions in 
survivors of shipwreck. The subject matter is predominantly medical, 
discussed largely from the medical standpoint, and, with the exceptions 
of the sections on neurology and psychiatry and occupational therapy 
and vocational guidance, probably of more general interest to medical 
men than to psychologists. 

The section on neurology and psychiatry should be of particular 
interest to clinical psychologists inasmuch as it includes some discussion 
of the problems of brain injuries and their rehabilitation, speech disorders, 
psychologic reactions to injury, and malingering. Though the prog- 
nostic importance of posttraumatic intellectual defect is stressed, relevant 
psychologic techniques for assaying this condition receive inadequate 
treatment. In general, the contribution of the clinical psychologist to 
neuropsychiatric diagnosis, management, and treatment is distinctly 
underemphasized if not almost entirely neglected. 

The section on occupational therapy and vocational guidance con- 
tains a number of papers which should be of special interest to psycholo- 
gists. Psychologic vocational guidance techniques and their potential 
contributions to solution of vocational rehabilitation problems are not 
adequately emphasized, however. 

Throughout the book, the various authors show sporadic recognition 
of the importance of psychologic factors in the recovery and rehabilita- 
tion of the injured man. This complicated and difficult aspect of medi- 
cine, however, is all too frequently dismissed with a few rather vague 
sentences concerning its importance. This is to be regretted since post- 
graduate education of medical practitioners along this line might well be 
one of the aims of a book of this sort. 

Howard F. Hunt 
University of Minnesota 


Hamalainen, Arthur E. An appraisal of anecdotal records. New York: 
Bureau of Publications, Teachers College, Columbia University, 1943. 
Pp. 87. $1.85. 


Mr. Hamalainen has written one of the best analyses of anecdotal 
records yet published. He gives the facts with great scrupulousness and 
he gives his authorities so that the reader can make his own judgment 
of their reliability. The book is not merely useful; it is readable. It 
has the rare merit of being impartial. A moderately complex experiment 
is reported in which statistical techniques are employed to advantage. 
Among the conclusions drawn is the following: ‘‘The anecdotes are often 
as much a reflection of the teacher’s outlook as they are of the child’s 
behavior.” This supports the generally accepted dictum that no instru- 
ment for use in guidance has merit in its own right; its effectiveness is 
conditioned upon the skill of the operator. 

The anecdotal record, it appears, has complementary advantages if 
used under optimum conditions—a device for guidance of the pupil, and 
a device for evaluation of the teacher. 

F. S. Beers 
State Technical Advisory Service 
Social Security Board 
Washington, D. C. 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be 
sent to Donald G. Paterson, Editor, Department of Psychology, 
University of Minnesota, Minneapolis 14, Minnesota 


The troubled mind. Charles S. Bluemel. Baltimore 2: The Williams & 
Wilkins Co., 1944. Pp. 523. $3.50. 

A guide for public opinion polls. George Gallup. New Jersey: Prince- 
ton University Press, 1944. Pp. 80. $1.50. 

Absenteeism—let’s solve it the right way. Ray E. Hibbs. Minneapolis 1: 
North Star Woolen Mill Co., March, 1944. Pp. 23. Free. 

Labor turnover. Ray E. Hibbs. Minneapolis 1: North Star Woolen 
Mill Co., August, 1944. Pp. 30. Free. 

Psychiatry for nurses. Karnosh and Gage. St. Louis: C. V. Mosby Co., 
1944. (Second edition) Pp. 339. $2.75. 

Rebel without a cause: the hypnoanalysis of a criminal psychopath. Robert 
M. Lindner. New York: Grune & Stratton, Inc., 1944. Pp. 310. 
$4.00. 

Light, vision and seeing. Matthew Luckiesh. New York: D. Van Nos- 
trand Co., Inc., 1944. Pp. xiv + 323. $4.50. 

Three friends. Elizabeth Montgomery and Dorothy Baruch. Spring- 
field: Scott, Foresman & Company, 1944. Pp. 160. $.84, list. 

Mental catharsis and the psychodrama. J. L. Moreno. New York 17: 
Beacon House, Inc., 1944. $1.25. 

Psychodramatic shock therapy. J. L. Moreno. New York 17: Beacon 
House, Inc., 1944. Pp. 30. $1.25. 

Psychodramatic treatment of performance neurosis. J. L. Moreno. New 
York 17: Beacon House, Inc., 1944. Pp. 31. $1.50. 

Sociodrama. J. L. Moreno. New York 17: Beacon House, Inc., 1944. 
Pp. 16. $1.25. 

Vocational guidance of the disabled soldier. W. M. O’Neil and J. P. 
Young. A. H. Pettifer, Acting Government Printer, Phillip-street, 
Sydney, New South Wales, 1943. Pp. 28. 

Supplementary guide for the revised Stanford-Binet scale (Form L). Ru- 
dolf Pintner, Anna Dragositz, and Rose Kushner. Stanford Univer- 
sity: Stanford University Press, 1944. Applied Psychology Mono- 
graphs Series No. 3. Pp. 135. Paper cover, $1.50. Cloth cover, 
$2.25. 

Guidance and personnel services in education. Anna Y. Reed. Ithaca: 
Cornell University Press, 1944. Pp. 496. $4.75. 

A dictionary of international slurs. A. A. Roback. Cambridge: Sci-Art 
Publishers, 1944. $6.25. 

Social and emotional adjustments of regularly promoted and non-promoted 
pupils. Adolph A. Sandin. New York: Bureau of Publications, 
Teachers College, Columbia University, 1944. Pp. 142. $2.15. 
Child Development Monograph No. 32. 

Psychiatry and the war. Edited by Frank J. Sladen. Springfield: 
Charles C. Thomas, 1944. Pp. 464. $5.00. 

Home economics in junior colleges. Ivol Spafford and others. Minneapolis: 
The Burgess Publishing Company, 1944. Pp. 84. $1.50. 

Today’s handbook for librarians. Mary A. Sweeney. Chicago: The 
American Library Association, 1944. Pp. 100. $.75. 

On growth and form. Sir D’Arcy W. Thompson. New York: The Mac- 
millan Company, 1944. (Revised edition) $12.50. 

Evaluation in teacher education. Maurice E. Troyer and C. Robert Pace. 
Prepared for the Commission on Teacher Education of the American 
Council on Education, 744 Jackson Place, Washington 6, D.C. Pp. 
369. $3.00. 

Measurement of adult intelligence. David Wechsler. Baltimore 2: The 
Williams & Wilkins Company, 1944. (Third edition) $3.50. 

Training and reference manual for job analysis. War Manpower Com- 
mission, Bureau of Manpower Utilization, Division of Occupational 
Analysis and Manning Tables. Washington, D. C.: Government 
Printing Office, 1944. Pp. 104. $.20. 


AMERICAN PSYCHOLOGICAL PERIODICALS 


American Journal of Psychology—Ithaca, N. Y.; Cornell University. Subscription $6.50. 624 pages annually. 
Edited by K. M. Dallenbach, Madison Bentley, and E. G. Boring. Quarterly. General and experimental 
psychology. Founded 1887. 


Journal of Genetic Psychology—Provincetown, Mass.; The Journal Press. Subscription $14.00 per year 
(2 volumes). 1000 pages annually. Edited by Carl Murchison. Quarterly. Child behavior, animal 
behavior, and comparative psychology. Founded 1891. 

Psychological Review—Northwestern University, Evanston, Illinois; American Psychological Association, Inc. 
Subscription $5.50. 540 pages annually. Edited by Herbert S. Langfeld. Bi-monthly. General 
psychology. Founded 1894. 


Psychological Monographs—Northwestern University, Evanston, Illinois; American Psychological Association, 
Inc. Subscription $6.00 per volume. 500 pages. Edited by John F. Dashiell. Without fixed dates, 
each issue one or more researches. Founded 1895. 


Psychological Bulletin—Northwestern University, Evanston, Illinois; American Psychological Association, Inc. 
Subscription $7.00. 665 pages annually. Edited by John E. Anderson. Monthly (10 numbers). 
Psychological literature. Founded 1904. 